A Tradeoff between Secrecy and Reliability
Abstract: Equivocation rate has been widely used as an
information-theoretic measure of security since Shannon [12]. It
simplifies problems by removing the effect of atypical behavior
from the system. In [11], however, Merhav and Arikan considered
the alternative of using the guessing exponent to analyze
Shannon's cipher system. Because the guessing exponent captures
atypical behavior, the strongest expressible notion of secrecy
requires the more stringent condition that the size of the key,
instead of its entropy rate, be equal to the size of the message.^1
The relationship between equivocation and guessing exponent
is also investigated in [8], [9], but it is unclear which is the better
measure, and whether there is a unifying measure of security.

Instead of using equivocation rate or guessing exponent, we
study the wiretap channel in [2] using the success exponent,
defined as the exponent of the probability that a wiretapper
successfully learns the secret after making an exponential number
of guesses to a sequential verifier that gives a yes/no answer to
each guess. By extending the coding scheme in [2], [6] and the
converse proof in [4] with the new Overlap Lemma V.2, we obtain a
tradeoff between secrecy and reliability expressed in terms of lower
bounds on the error and success exponents of authorized and
unauthorized decoding of the transmitted messages, respectively.
From this, we obtain an inner bound to the region of strongly
achievable public, private and guessing rate triples for which
the exponents are strictly positive. The closure of this region is
equivalent to the closure of the region in Theorem 1 of [2] when
we treat equivocation rate as the guessing rate. However, it is
unclear whether the inner bound is tight.
I. Introduction

The basic model of a cryptographic/secrecy system involves
a sender Alice who wants to send a message $\mathsf{S}$ as secretly as
possible to the intended receiver Bob. The basic model of
a cryptanalytic attack, on the other hand, involves a
cryptanalyst/wiretapper Eve who attempts to learn as much of the
secret as possible based on her observation $\mathsf{Z}$. How secretly
a message is sent, or how much information is leaked, must
therefore be quantified before one can design and optimize a
cryptographic system or a cryptanalytic attack for the respective
purposes.

^1 This is the condition for a finite system to achieve perfect secrecy, as
pointed out by Shannon [12].

Fig. 1. Genie-aided correction channel

The a posteriori probability function $P_{S|Z}$ is a sufficient
statistic of the security of the system, as it gives all the
possible values of the secret and their associated probabilities
for every possible realization of the wiretapper's observation.
In particular, the important notion of a system being perfectly
secure, referred to as perfect secrecy by Shannon [12], can
be characterized as the a posteriori probability being equal to the
prior, i.e. $P_{S|Z} = P_S$. In other words, Eve's observation is
independent of the secret, or equivalently, the system is at the
same level of security whether $\mathsf{Z}$ is observed or not.

It is convenient to summarize the a posteriori probability
function by the index called equivocation $H(S|Z)$. It is roughly
the amount of information the wiretapper needs to gather in
addition to $\mathsf{Z}$ to perfectly recover $\mathsf{S}$. One precise
operational meaning of equivocation, as illustrated in Fig. 1, is the
minimum achievable rate for source coding an i.i.d. sequence
$\{\mathsf{S}^{(i)}\}$ with the i.i.d. sequence $\{\mathsf{Z}^{(i)}\}$ as side
information at the decoder.^2 To achieve perfect secrecy, it is
necessary and sufficient to have $H(S|Z) = H(S)$. Alice can also try to
protect the secret up to an equivocation $H(S|Z)$ below $H(S)$ if perfect
secrecy is costly and unnecessary.
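As a small numerical illustration of the condition $H(S|Z) = H(S)$, the following Python sketch computes the equivocation of two toy systems. The one-bit one-time pad and the cleartext transmission are assumptions chosen for illustration; they are not systems analyzed in this paper.

```python
from math import log2

def entropy(pmf):
    """H(S) in bits for a pmf given as {value: probability}."""
    return sum(p * log2(1.0 / p) for p in pmf.values() if p > 0)

def conditional_entropy(joint):
    """H(S|Z) in bits for a joint pmf given as {(s, z): probability}."""
    p_z = {}
    for (_, z), p in joint.items():
        p_z[z] = p_z.get(z, 0.0) + p
    # H(S|Z) = sum over (s, z) of P(s, z) * log2(P(z) / P(s, z))
    return sum(p * log2(p_z[z] / p) for (_, z), p in joint.items() if p > 0)

# Illustrative systems (assumed): a uniform one-bit secret S observed as
# Z = S xor K with a uniform one-bit key K, and the same secret sent as Z = S.
one_time_pad = {(s, s ^ k): 0.25 for s in (0, 1) for k in (0, 1)}
cleartext = {(s, s): 0.5 for s in (0, 1)}

h_s = entropy({0: 0.5, 1: 0.5})
h_pad = conditional_entropy(one_time_pad)
h_clear = conditional_entropy(cleartext)
print(h_s, h_pad, h_clear)
```

In this toy case the one-time pad keeps the equivocation equal to the prior entropy, while sending the secret in the clear drives it to zero.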

The amount of additional information Eve needs to gather
to break the system may not reflect how difficult it is to obtain
it. For example, getting just one bit of information from
Alice or someone who knows the secret may require significant
effort in the search for that person, followed by lengthy
interrogation. In some situations, Eve does not play a passive
role of receiving additional information that is concisely stated
(i.e. maximally compressed by a genie), but instead plays an
active role in identifying and extracting relevant information
from disorganized sources. Thus, one should question whether
equivocation is applicable to the case of interest, despite its
mathematical convenience.

^2 This is the correction data model originally proposed by Shannon [12],
except that the genie does not need to know $\mathsf{Z}$ nor any decision
feedback from Bob.

A natural alternative measure of security, as investigated
by Merhav and Arikan [11], is roughly the ability of Eve to
perfectly learn the secret from yes/no answers to "Is the secret
equal to ...?" type of questions. In the model, Eve sequentially
verifies her guesses of the secret by asking yes/no questions. The
number of guesses and verifications she needs to make until
she is within some probability of guessing the secret correctly
indicates her effort and ability to extract information about the
secret. Sometimes the system itself provides such a verifier,
which helps correct careless mistakes made by the authorized
user. This potentially leaks information to unauthorized users
who also have access to the verifier, just as in the case of a
login system. A system designer may be interested in knowing
how many wrong passwords should be allowed for each
session so that the chance of successfully breaking into the
account is reasonably small. Although this success probability
does not have a way to express the notion of perfect secrecy in
general (see Example A.1), it is a natural fit for this problem
as it provides the number of trials as an additional parameter
to optimize.
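The login example can be made concrete with a short sketch. Assuming Eve knows the a posteriori distribution of the secret, her best list of distinct guesses is the set of most likely candidates, and the designer can look for the largest number of trials that keeps her success probability below a target. The function names, the password values and the probabilities below are hypothetical.

```python
def success_probability(posterior, num_guesses):
    """Best chance of hitting the secret with `num_guesses` distinct guesses.

    `posterior` maps each candidate secret to its probability given the
    attacker's observation; the best list is the most likely candidates.
    """
    ranked = sorted(posterior.values(), reverse=True)
    return sum(ranked[:num_guesses])

def max_guesses_allowed(posterior, tolerance):
    """Largest number of guesses keeping the success probability <= tolerance."""
    allowed = 0
    while (allowed < len(posterior)
           and success_probability(posterior, allowed + 1) <= tolerance):
        allowed += 1
    return allowed

# Hypothetical posterior over four candidate passwords.
posterior = {"1234": 0.5, "0000": 0.25, "1111": 0.125, "4321": 0.125}
print(success_probability(posterior, 2))   # mass of the two likeliest candidates
print(max_guesses_allowed(posterior, 0.8))
```

The number of trials plays the role of the list size used in the list decoding attack of Section III.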

In the sequel, we will consider the wiretap channel problem
in [2]. A key result from [2] is the single-letter characterization
of the secrecy capacity, defined as the maximum rate at which
the secret can be transmitted to Bob by a block coding scheme
with arbitrarily small error probability and with the equivocation
rate equal to the message rate. Transmitting at a rate above this
secrecy capacity, one faces the tradeoff of a lower equivocation
rate. Transmitting at a rate below the secrecy capacity, however,
the equivocation rate is capped at the message rate, so there seems
to be little point in further reducing the rate below secrecy
capacity. If one also cares about delay, i.e. how fast the error
probability converges to zero, further reducing the rate below
secrecy capacity can be beneficial. What is the tradeoff then?

Secrecy comes at a cost in the reliability of the authorized
decoding. To characterize which levels of secrecy and reliability
are simultaneously achievable for each rate, we will use
the standard notion of error exponents for Bob and Eve
in decoding their messages as a measure of reliability. For
secrecy, we will use the exponent of the probability that Eve
learns the secret within an exponential number of guesses,
or success exponent for short.

The rest of the paper is organized as follows. Section III
defines the wiretap channel problem we consider.
Section IV describes the proposed coding scheme. Section V
explains the computation of the success exponent using a
technique we call the Overlap Lemma V.2. Section VI explains
the computation of the error exponents using the Packing
Lemma [3]. Finally, the desired lower bounds on the exponents
are stated in Section VII. Section VIII gives the conclusion
and some open problems. For readers who would like to skip
to the main result, Section II provides a brief summary of
notations.

II. Preliminaries 

Calligraphic font denotes a set, e.g. $\mathcal{A}$, which is always
assumed finite unless otherwise stated. $2^{\mathcal{A}}$ and $\mathcal{A}^c$
denote the power set and complement of $\mathcal{A}$ respectively.
$\mathcal{A} \cup \mathcal{B}$, $\mathcal{A} \cap \mathcal{B}$ and
$\mathcal{A} \setminus \mathcal{B}$ denote the usual set operations, which are
the union, intersection, and difference respectively.
$\operatorname{Avg}_{a \in \mathcal{A}}$ (or $\operatorname{Avg}_a$ for short)
denotes the averaging operation $\frac{1}{|\mathcal{A}|} \sum_{a \in \mathcal{A}}$.

Fig. 2. Wiretap channel model

$\mathbb{R}$, $\mathbb{R}_+$ and $\mathbb{Z}_+$ denote the sets of real numbers,
non-negative real numbers, and positive integers respectively.
Occasionally, when there is no ambiguity, a positive integer $L$ will
also be used to denote the set $\{1, \dots, L\}$, as in $l \in L$. A bold
letter such as $\mathbf{x}$ denotes an n-sequence
$\{x^{(i)}\}_{i=1}^n = (x^{(1)}, \dots, x^{(n)})$; and $\mathbf{u} \circ \mathbf{x}$
denotes the element-wise concatenation $\{(u^{(i)}, x^{(i)})\}_{i=1}^n$.



Sans serif font is used for random variables and stochastic
functions, e.g. $\mathsf{X}$, $\mathsf{f}$ and $\mathsf{W}_b$.
$\mathscr{P}(\mathcal{Y})^{\mathcal{X}}$ denotes the set of all
possible conditional probability distributions $P_{Y|X}$ of a random
variable $\mathsf{Y}$ taking values from $\mathcal{Y}$, denoted as
$\mathsf{Y} \in \mathcal{Y}$, given a random variable $\mathsf{X} \in \mathcal{X}$.
The (conditional) probability distribution will also be viewed as a row
vector (matrix), e.g. $P_X P_{Y|X}$ denotes the matrix multiplication, which
gives the marginal distribution $P_Y$. $P_X \circ P_{Y|X}$ denotes the direct
product, which gives the joint distribution $P_{X,Y}$ of the pair
$(\mathsf{X}, \mathsf{Y})$ in this case. $P_X^n$ denotes the n-th direct product
such that $P_X^n(\mathbf{x}) = \prod_{i=1}^n P_X(x^{(i)})$. For any subset
$\mathcal{A} \subseteq \mathcal{X}$, $P_X(\mathcal{A}) = \sum_{x \in \mathcal{A}} P_X(x)$.
$E(\mathsf{X})$ denotes the expectation of $\mathsf{X}$.
$\delta_{\mathrm{var}}(P, Q)$ denotes the variation distance (25) between $P$ and
$Q$.

Following the notations in [3] for the method of types,
$P_{\mathbf{x}}$ and $P_{\mathbf{y}|\mathbf{x}}$ denote the type (6) and the
canonical conditional type (8) respectively. 'Canonical' refers to the
constraint (for convenience) that $P_{\mathbf{y}|\mathbf{x}}(y|x) = 1/|\mathcal{Y}|$
if $P_{\mathbf{x}}(x) = 0$, for all $(x, y) \in \mathcal{X} \times \mathcal{Y}$.
$T_Q^n$, or $T_Q$ for short, denotes the class of n-sequences of type $Q$.
$T_V(\mathbf{x})$ denotes the V-shell of $\mathbf{x}$.
$\mathscr{P}_n(\mathcal{X})$ denotes the set of all types for sequences in
$\mathcal{X}^n$. $\mathscr{V}_n(Q, \mathcal{Y})$ ($\mathscr{V}_n(Q)$ or
$\mathscr{V}_n$ for short) denotes the set of all canonical conditional
types $V$ for sequences in $\mathcal{Y}^n$. $I(Q, V|P)$, $D(V\|W|Q)$,
and $H(V|P)$ are the conditional mutual information (29),
divergence (10) and entropy (11) respectively. $I(\mathbf{x} \wedge \mathbf{y})$ (20)
denotes the empirical mutual information. Equivalently, we
write $T_X := T_{P_X}$ and $T_{Y|X} := T_{P_{Y|X}}$, which are non-empty if
the corresponding distributions are valid (conditional) types.
$|T_{Y|X}|$ denotes $|T_{P_{Y|X}}(\mathbf{x})|$ with $\mathbf{x} \in T_X$.

To express inequality in the exponent for functions in $n$, we
use $a_n \,\dot{\le}\, b_n$ to denote that
$\limsup_{n \to \infty} \frac{1}{n} \log a_n$ is no larger than
$\liminf_{n \to \infty} \frac{1}{n} \log b_n$. A piecewise function will be
expressed in terms of $|a|^+ := \max\{0, a\}$ and $|a|^- := \min\{0, a\}$.

III. Problem formulation 

A. Transmission model 

Fig. 2 illustrates a single use of the discrete memoryless
wiretap channel $(\mathsf{W}_b, \mathsf{W}_e)$ using the dummy random variables
$\mathsf{X}$, $\mathsf{Y}$ and $\mathsf{Z}$. Alice sends a random variable
$\mathsf{X}$ through the channel. $P_X \in \mathscr{P}(\mathcal{X})$ is the
probability distribution function/vector of $\mathsf{X}$ over the finite set
$\mathcal{X}$, such that $P_X(x) = \Pr\{\mathsf{X} = x\}$ ($x \in \mathcal{X}$)
and $P_X(\mathcal{A}) = \Pr\{\mathsf{X} \in \mathcal{A}\}$
($\mathcal{A} \subseteq \mathcal{X}$).

Fig. 3. Transmission model

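The channel is denoted by the pair
$(W_b \in \mathscr{P}(\mathcal{Y})^{\mathcal{X}}, W_e \in \mathscr{P}(\mathcal{Z})^{\mathcal{X}})$
of conditional probability distributions. We write
$\mathsf{W}_b(\mathsf{X})$ and $\mathsf{W}_e(\mathsf{X})$ as the channel outputs
$\mathsf{Y}$ and $\mathsf{Z}$ observed by Bob and Eve respectively. The
conditional distribution $P_{Y|X}(y|x) := \Pr\{\mathsf{Y} = y \mid \mathsf{X} = x\}$
equals $W_b(y|x)$ for all $(x, y) \in \mathcal{X} \times \mathcal{Y}$, and
similarly for $P_{Z|X}$. For the case of interest, all sets $\mathcal{X}$,
$\mathcal{Y}$ and $\mathcal{Z}$ are finite and the correlation between
$\mathsf{Y}$ and $\mathsf{Z}$ given $\mathsf{X}$ need not be specified.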

To transmit information through this channel, we will con- 
sider the (data) transmission model illustrated in Fig. 3 with 
block length n. Following [2], we consider n uses of the 
channel with stochastic encoding, and deterministic decoders 
at the receivers. As pointed out in [2], stochastic encoding, i.e. 
randomization in the encoder during transmission, increases 
secrecy by adding noise as a physical barrier to eavesdropping 
while deterministic decoding does not lose optimality for the 
case of interest. 

As shown in Fig. 3, Alice chooses a public/common message
$m$ out of a set of $M$ possible messages to convey to both Bob
and Eve, and a private/secret/confidential message $l \in L$ to convey
only to Bob. ($l \in L$ is a short-hand notation for $l \in \{1, \dots, L\}$.)
Since the message $m$ for Eve is a degraded version of the
message $(m, l)$ to Bob, this is identical to the asymmetric
broadcasting of degraded message sets [3] except for the
additional secrecy concern.

In the transmission phase, Alice first passes the message
through a stochastic encoder denoted by the conditional probability
distribution $f \in \mathscr{P}(\mathcal{X}^n)^{M \times L}$. We write
$\mathsf{f}(m, l)$ as the output codeword, which is denoted by the dummy
random n-sequence $\mathsf{X} := \{\mathsf{X}^{(i)}\}_{i=1}^n$ in Fig. 3. The
encoder can be viewed as an artificial channel, through which the output
codeword $\mathsf{X}$ of the message $(m, l)$ must satisfy
$\Pr\{\mathsf{X} = \mathbf{x}\} = f(\mathbf{x}|m, l)$.
It effectively adds additional noise to make it hard for Eve to
learn the secret. This artificial noise also affects Bob since he
does not know it a priori.

Alice then transmits the random codeword $\mathsf{X}$ through n uses
of the wiretap channel. The n-th extension of the wiretap channel
is characterized by the n-th direct power $(W_b^n, W_e^n)$, where
$W_b^n(\mathbf{y}|\mathbf{x}) = \prod_{i=1}^n W_b(y^{(i)}|x^{(i)})$ and similarly
for $W_e^n$. Bob uses his channel output $\mathsf{Y}$ to decode both the public
and private messages with a deterministic decoder
$\phi_b : \mathcal{Y}^n \to M \times L$.
$\Phi_b : M \times L \to 2^{\mathcal{Y}^n}$ denotes the decision region so that

$$\phi_b(\mathbf{y}) = (m, l) \iff \mathbf{y} \in \Phi_b(m, l) \tag{1}$$

Similarly, Eve uses her channel output $\mathsf{Z}$ to decode the public
message with decoder $\phi_e : \mathcal{Z}^n \to M$ and decision region
$\Phi_e : M \to 2^{\mathcal{Z}^n}$. She, however, also generates an unordered
set of $\lambda \le L$ distinct guesses of the secret using a list decoder
$\psi : \mathcal{Z}^n \to \{\mathcal{A} \subseteq L : |\mathcal{A}| = \lambda\}$,
which is a correspondence.
The decision region $\Psi : L \to 2^{\mathcal{Z}^n}$ satisfies

$$l \in \psi(\mathbf{z}) \iff \mathbf{z} \in \Psi(l) \tag{2}$$

The triple $(f, \phi_b, \phi_e)$ will be called an (n-block) wiretap
channel code, while the list decoder $\psi$ will be called the list
decoding attack (with deterministic list size). The quadruple
$(f, \phi_b, \phi_e, \psi)$ will be called an (n-block) transmission (model)
for the wiretap channel.



B. Achievable rate and exponent triples 

The performance of a wiretap channel code with respect to 
a list decoding attack is evaluated based on the following fault 
events. 

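Definition III.1 (Fault events). Let $\mathcal{E}_b(m, l)$, $\mathcal{E}_e(m, l)$ and
$\mathcal{S}_e(m, l)$ be the fault events that Bob decodes $(m, l)$ wrong,
Eve decodes $m$ wrong, and Eve successfully guesses $l$ respectively
when $(m, l)$ is the public and private message pair, i.e.

$$\begin{aligned}
\mathcal{E}_b(m, l) &:= \{\phi_b(\mathsf{W}_b^n(\mathsf{f}(m, l))) \ne (m, l)\} \\
\mathcal{E}_e(m, l) &:= \{\phi_e(\mathsf{W}_e^n(\mathsf{f}(m, l))) \ne m\} \\
\mathcal{S}_e(m, l) &:= \{l \in \psi(\mathsf{W}_e^n(\mathsf{f}(m, l)))\}
\end{aligned}$$

The corresponding (average) fault probabilities (over the message
set $M \times L$), $e_b$, $e_e$ and $s_e$, can be computed as follows.

$$e_b = \operatorname{Avg}_{m \in M,\, l \in L} \sum_{\mathbf{x} \in \mathcal{X}^n} W_b^n(\Phi_b^c(m, l) \mid \mathbf{x})\, f(\mathbf{x}|m, l) \tag{3a}$$

$$e_e = \operatorname{Avg}_{m \in M,\, l \in L} \sum_{\mathbf{x} \in \mathcal{X}^n} W_e^n(\Phi_e^c(m) \mid \mathbf{x})\, f(\mathbf{x}|m, l) \tag{3b}$$

$$s_e = \operatorname{Avg}_{m \in M,\, l \in L} \sum_{\mathbf{x} \in \mathcal{X}^n} W_e^n(\Psi(l) \mid \mathbf{x})\, f(\mathbf{x}|m, l) \tag{3c}$$

where $\Phi_b^c(m, l)$ and $\Phi_e^c(m)$ are the complements of
$\Phi_b(m, l)$ and $\Phi_e(m)$ respectively; and
$\operatorname{Avg}_{m \in M,\, l \in L}$ denotes
$\frac{1}{ML} \sum_{m \in M} \sum_{l \in L}$. When there is ambiguity, we will
write $e_b(f, \phi_b, W_b)$ etc. to explicitly state the dependencies.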

We study the asymptotic properties when the sizes $M$ and
$L$ of the message sets and $\lambda$ of Eve's guessing list grow
exponentially while the fault probabilities decay exponentially
in $n$. The exponential rates are defined as follows.

Definition III.2. Consider a sequence of n-block transmissions
$(f^{(n)}, \phi_b^{(n)}, \phi_e^{(n)}, \psi^{(n)})$ ($n \in \mathbb{Z}_+$) over the
wiretap channel $(W_b, W_e)$. The public message rate $R_M$, private message
rate $R_L$ and guessing rate $R_\lambda$ are defined as,

$$R_M := \liminf_{n \to \infty} \frac{1}{n} \log M^{(n)} \tag{4a}$$

$$R_L := \liminf_{n \to \infty} \frac{1}{n} \log L^{(n)} \tag{4b}$$

$$R_\lambda := \limsup_{n \to \infty} \frac{1}{n} \log \lambda^{(n)} \tag{4c}$$

The exponents of the fault probabilities (3) are defined as,

$$E_b := \liminf_{n \to \infty} -\frac{1}{n} \log e_b^{(n)} \tag{5a}$$

$$E_e := \liminf_{n \to \infty} -\frac{1}{n} \log e_e^{(n)} \tag{5b}$$

$$S_e := \liminf_{n \to \infty} -\frac{1}{n} \log s_e^{(n)} \tag{5c}$$

where $e_b^{(n)}$ and alike denote $e_b$ evaluated with respect to the
n-block transmission. For simplicity, the superscript $(n)$ will
be omitted hereafter if there is no ambiguity.

In the code design phase prior to the transmission phase,
Alice chooses $(f, \phi_b, \phi_e)$ without knowledge of $\psi$, and then
Eve chooses $\psi$ knowing Alice's choice. In particular, Eve
chooses $\psi$ to minimize $S_e$ so that her success probability
$s_e^{(n)}$ decays to zero as slowly as possible, while Alice chooses
$(f, \phi_b, \phi_e)$ to make $E_b$, $E_e$ and $S_e$ large so that the error
probabilities $e_b^{(n)}$ and $e_e^{(n)}$ decay to zero fast for reliability,
and the probability $s_e^{(n)}$ of successful attack by Eve decays to zero
fast for secrecy. The tradeoff between secrecy and reliability
for Alice can be expressed in terms of the set of achievable
rate and exponent triples defined as follows.

Definition III.3 (Achievable rate and exponent triples). The
rate triple $(R_1, R_2, R_3) \in \mathbb{R}_+^3$, where
$\mathbb{R}_+ := \{a \in \mathbb{R} : a \ge 0\}$,
is achievable if there exists a sequence of wiretap channel
codes $(f, \phi_b, \phi_e)$ with rates,

$$R_M \ge R_1 \quad \text{and} \quad R_L \ge R_2$$

such that for any sequence of list decoding attacks $\psi$ with
guessing rate $R_\lambda \le R_3$, the probabilities $e_b$, $e_e$ and $s_e$
converge to zero as $n \to \infty$.

The exponent triple $(E_1, E_2, E_3) \in \mathbb{R}_+^3$ is achievable with
respect to the rate triple if, in addition,

$$E_b \ge E_1 \quad \text{and} \quad E_e \ge E_2 \quad \text{and} \quad S_e \ge E_3$$

If the achievable exponents are strictly positive, the rate triple
is said to be strongly achievable.

In the sequel, we will obtain an inner bound to the set
of achievable exponent triples in the form of parameterized
single-letter lower bounds, one for each exponent.^3 From this,
an inner bound to the set of strongly achievable rate triples will
be obtained, the closure of which coincides with the closure of
the achievable region in Theorem 1 of [2] when the guessing
rate is treated as equivocation rate.

^3 In response to the question of using average instead of maximum error
probabilities (over the message set), we would like to point out that the
particular inner bound to be derived also holds when $e_b$ and $e_e$ are defined
as the corresponding maximum error probabilities and $s_e$ as the average
success probability. It follows from the usual argument of successively
expurgating the worst half of the codewords as in [6], which turns out to
preserve the desired overlap property of the code and hence the bound for the
success exponent (see Section V). If one defined $s_e$ as the maximum
probability, however, the problem becomes degenerate since there is an obvious
strategy for Eve to achieve $s_e = 1$.

IV. Coding scheme 

The coding scheme (i.e. the specification of the sequence
of wiretap channel codes $(f, \phi_b, \phi_e)$, see Fig. 3) considered
here is a merge of the schemes in [2] and [6] using the
method of types developed by Csiszár [3]. We will describe
each key component of the code in succession and explain
how each of them simplifies the analysis of the fault events
(see Definition III.1).

A. Constant composition code 

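As a first step, the output of the stochastic encoder is restricted
to a constant composition code [3], defined as follows. Let
$N(x|\mathbf{x})$ denote the number of occurrences of symbol $x \in \mathcal{X}$
in the n-sequence $\mathbf{x} \in \mathcal{X}^n$. The type or empirical
distribution $P_{\mathbf{x}}$ of $\mathbf{x}$ is defined as the probability mass
function,

$$P_{\mathbf{x}}(x) := \frac{N(x|\mathbf{x})}{n} \tag{6}$$

Let $\mathscr{P}_n(\mathcal{X}) := \{P_{\mathbf{x}} : \mathbf{x} \in \mathcal{X}^n\}$
denote the set of all possible types of an n-sequence in $\mathcal{X}^n$. The
type class $T_Q^n := \{\mathbf{x} : P_{\mathbf{x}} = Q\}$, or $T_Q$ for short,
denotes the set of all n-sequences $\mathbf{x}$ having type
$Q \in \mathscr{P}_n(\mathcal{X})$. An n-block constant composition
code $\theta$ on $\mathcal{X}$ is an ordered tuple of codewords all from the
same type class on $\mathcal{X}$, i.e.
$\exists Q \in \mathscr{P}_n(\mathcal{X}),\ \theta \subseteq T_Q$.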

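Suppose $\theta$ is the constant composition code of type $Q$ for
the stochastic encoder $f$. Then, $f(\mathbf{x}|m, l) = 0$ for all
$\mathbf{x} \notin \theta$. From (3a),

$$e_b = \operatorname{Avg}_{m \in M,\, l \in L} \sum_{\mathbf{c} \in \theta} W_b^n(\Phi_b^c(m, l) \mid \mathbf{c})\, f(\mathbf{c}|m, l) \tag{7}$$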

and similarly for other probabilities in (3). To further simplify
the expressions, define the canonical conditional type
$P_{\mathbf{y}|\mathbf{x}}$ of $\mathbf{y}$ given $\mathbf{x}$ as,

$$P_{\mathbf{y}|\mathbf{x}}(y|x) := \begin{cases}
1/|\mathcal{Y}|, & N(x|\mathbf{x}) = 0 \\
\dfrac{N(x, y|\mathbf{x}, \mathbf{y})}{N(x|\mathbf{x})}, & \text{otherwise}
\end{cases} \tag{8}$$

for all $x \in \mathcal{X}$, $y \in \mathcal{Y}$, where
$N(x, y|\mathbf{x}, \mathbf{y})$ is the number of occurrences of the pair
$(x, y)$ in the n-sequence $\{(x^{(i)}, y^{(i)})\}_{i=1}^n$
of pairs. The canonical conditional type of $\mathbf{y}$ given $\mathbf{x}$
exists and is unique by definition.^4 However, with a canonical conditional
type $V$ given $\mathbf{x}$ specified, there can be more than one $\mathbf{y}$
satisfying it.^5 If $V : \mathcal{X} \to \mathcal{Y}$ is the conditional type of
$\mathbf{y}$ given $\mathbf{x}$, $\mathbf{y}$ is said to lie in
$T_V(\mathbf{x})$, referred to as the V-shell of $\mathbf{x}$ or the conditional
type class of $V$ given $\mathbf{x}$. In other words, $T_V(\mathbf{x})$ is the
set of all $\mathbf{y} \in \mathcal{Y}^n$ with conditional type $V$ given
$\mathbf{x}$.

^4 This is a minor modification of the conditional type defined in
Definition 1.2.4 of [3], according to which $\mathbf{y}$ may have a continuum of
conditional types $V$ given $\mathbf{x}$ since $V(y|x)$ can be arbitrary when
$N(x|\mathbf{x}) = 0$.

^5 For example, the binary sequences 1100 and 0011 have the same canonical
conditional type given 1111, i.e. [.5 .5]. Similarly, 1111 has the same
canonical conditional type whether it is given 1100 or 0011, i.e.
$\begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix}$.
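The definitions (6) and (8) can be checked on small sequences with the following Python sketch, which uses exact fractions. The function names and the string encoding of sequences are assumptions made for illustration; the two comparisons at the end use the binary sequences mentioned in the footnote on non-uniqueness.

```python
from collections import Counter
from fractions import Fraction

def type_of(x, alphabet):
    """Type (6): empirical distribution of the sequence x over `alphabet`."""
    counts = Counter(x)
    return {a: Fraction(counts[a], len(x)) for a in alphabet}

def canonical_conditional_type(y, x, x_alphabet, y_alphabet):
    """Canonical conditional type (8) of y given x, as {x: {y: probability}}."""
    n_x = Counter(x)
    n_xy = Counter(zip(x, y))
    cond = {}
    for a in x_alphabet:
        if n_x[a] == 0:
            # Canonical convention: uniform row when a never occurs in x.
            cond[a] = {b: Fraction(1, len(y_alphabet)) for b in y_alphabet}
        else:
            cond[a] = {b: Fraction(n_xy[(a, b)], n_x[a]) for b in y_alphabet}
    return cond

bits = "01"
given_ones_a = canonical_conditional_type("1100", "1111", bits, bits)
given_ones_b = canonical_conditional_type("0011", "1111", bits, bits)
ones_given_a = canonical_conditional_type("1111", "1100", bits, bits)
ones_given_b = canonical_conditional_type("1111", "0011", bits, bits)
print(type_of("1100", bits))
print(given_ones_a == given_ones_b, ones_given_a == ones_given_b)
```

Both comparisons hold, so distinct sequences can indeed share one canonical conditional type, while each pair of sequences has exactly one.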

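Writing $W_b^n(\mathbf{y}|\mathbf{c})$ as the product
$\prod_{x, y} W_b(y|x)^{N(x, y|\mathbf{c}, \mathbf{y})}$,
Lemma 1.2.6 of [3] gives, for all $\mathbf{y} \in T_V(\mathbf{c})$,

$$W_b^n(\mathbf{y}|\mathbf{c}) = \exp\{-n[D(V\|W_b|Q) + H(V|Q)]\} \tag{9a}$$

$$\phantom{W_b^n(\mathbf{y}|\mathbf{c})} = \frac{W_b^n(T_V(\mathbf{c})|\mathbf{c})}{|T_V(\mathbf{c})|} \qquad \because \text{(9a) is uniform} \tag{9b}$$

where the conditional information divergence $D(V\|W_b|Q)$
and conditional entropy $H(V|Q)$ are defined as,

$$D(V\|W|Q) := \sum_{(x, y) \in \mathcal{X} \times \mathcal{Y}} Q(x) V(y|x) \ln \frac{V(y|x)}{W(y|x)} \tag{10}$$

$$H(V|Q) := \sum_{(x, y) \in \mathcal{X} \times \mathcal{Y}} Q(x) V(y|x) \ln \frac{1}{V(y|x)} \tag{11}$$

The key implication is that $W_b^n(\mathbf{y}|\mathbf{c})$ depends on
$\mathbf{y}$ only through the conditional type $P_{\mathbf{y}|\mathbf{c}}$, and
the channel output $\mathsf{W}_b^n(\mathbf{c})$
is uniformly distributed within every V-shell $T_V(\mathbf{c})$.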
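Identity (9a) can be verified numerically for a particular pair of sequences. The sketch below is only a sanity check under assumed inputs: the binary channel `W` and the sequences are hypothetical, and every transition that occurs is assumed to have positive probability.

```python
from collections import Counter
from math import exp, isclose, log, prod

def check_9a(c, y, W):
    """Return (product form of W^n(y|c), exp{-n[D(V||W|Q) + H(V|Q)]}).

    W is a channel given as {x: {y: probability}}; Q is the type of c and
    V the conditional type of y given c.
    """
    n = len(c)
    n_x = Counter(c)
    n_xy = Counter(zip(c, y))
    direct = prod(W[a][b] for a, b in zip(c, y))
    divergence = entropy = 0.0
    for (a, b), count in n_xy.items():
        q, v = n_x[a] / n, count / n_x[a]
        divergence += q * v * log(v / W[a][b])   # summand of (10)
        entropy += q * v * log(1.0 / v)          # summand of (11)
    return direct, exp(-n * (divergence + entropy))

# Hypothetical binary channel and sequences.
W = {"0": {"0": 0.9, "1": 0.1}, "1": {"0": 0.2, "1": 0.8}}
direct, formula = check_9a("0011", "0110", W)
same_shell, _ = check_9a("0011", "1001", W)   # another y with the same V
print(direct, formula, same_shell)
```

The second call uses a different output sequence from the same V-shell and gives the same probability, which is the uniformity used in (9b).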

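Let $\mathscr{V}_n(Q, \mathcal{Y}) := \{P_{\mathbf{y}|\mathbf{c}} : \mathbf{c} \in T_Q, \mathbf{y} \in \mathcal{Y}^n\}$
($\mathscr{V}_n(Q)$ or $\mathscr{V}_n$
for short) be the set of all possible canonical conditional types
of $\mathbf{y}$ given $\mathbf{c}$. This set depends on $\mathbf{c}$ only through
the type $Q$ of $\mathbf{c}$.^6
$\{T_V(\mathbf{c}) : V \in \mathscr{V}_n(Q)\}$ is a partitioning of
$\mathcal{Y}^n$ for every $\mathbf{c} \in \theta$ because every $\mathbf{y}$ has a
unique canonical conditional type given $\mathbf{c}$. We can therefore partition
the probabilities by $\mathscr{V}_n(Q)$ as follows. From (7),

$$\begin{aligned}
e_b &= \operatorname{Avg}_{m, l} \sum_{\mathbf{c} \in \theta} \sum_{V \in \mathscr{V}_n(Q)} W_b^n(\Phi_b^c(m, l) \cap T_V(\mathbf{c}) \mid \mathbf{c})\, f(\mathbf{c}|m, l) \\
&= \operatorname{Avg}_{m, l} \sum_{\mathbf{c} \in \theta} \sum_{V \in \mathscr{V}_n(Q)} W_b^n(T_V(\mathbf{c}) \mid \mathbf{c})\, \frac{|\Phi_b^c(m, l) \cap T_V(\mathbf{c})|}{|T_V(\mathbf{c})|}\, f(\mathbf{c}|m, l)
\end{aligned}$$

where the last equality is due to the piecewise uniform
distribution of the channel output $\mathsf{W}_b^n(\mathbf{c})$ implied by (9). By
Lemma 1.2.6 of [3],^7

$$W_b^n(T_V(\mathbf{c}) \mid \mathbf{c}) \le \exp\{-n D(V\|W_b|Q)\} \tag{12}$$

Thus, $e_b$ can be upper bounded as,

$$e_b \le \sum_{V \in \mathscr{V}_n(Q)} \exp\{-n D(V\|W_b|Q)\} \times \operatorname{Avg}_{m \in M,\, l \in L} \sum_{\mathbf{c} \in \theta} \frac{|\Phi_b^c(m, l) \cap T_V(\mathbf{c})|}{|T_V(\mathbf{c})|}\, f(\mathbf{c}|m, l) \tag{13}$$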



B. Transmission of junk data and prefix DMC 

^6 For example, if $\mathbf{y} = 011$ is in the V-shell of $\mathbf{c} = 011$, then
the permutation $\mathbf{y}' = 110$ of $\mathbf{y}$ is in the V-shell of the same
permutation $\mathbf{c}' = 110$ of $\mathbf{c}$. In general, if $V$ is a canonical
type of some sequence $\mathbf{y} \in \mathcal{Y}^n$ given
$\mathbf{c} \in \theta$, then the V-shell of another codeword
$\mathbf{c}' \in \theta$ must contain a sequence
$\mathbf{y}' \in \mathcal{Y}^n$, namely the sequence obtained from $\mathbf{y}$ by
the same permutation that takes $\mathbf{c}$ to $\mathbf{c}'$. Thus, the set of
all possible canonical conditional types is the same if the conditioning
sequences have the same type.

^7 The key step in the derivation is that
$V^n(T_V(\mathbf{c}) \mid \mathbf{c}) \le 1$ implies
$|T_V(\mathbf{c})| \le \exp\{n H(V|Q)\}$ by (9) with $W_b$ replaced by $V$.

In the previous section, the use of a constant composition
code simplifies the probability (3a) to (13), and similarly for
other probabilities in (3). In this section, we shall specify
the structure of the stochastic encoder $f$ and its uniform
randomization over junk data as follows.

Consider indexing the codewords in $\theta$ as $\mathbf{c}_{jlm}$ by $j \in J$,
$l \in L$ and $m \in M$, i.e.

$$\theta := \{\mathbf{c}_{jlm}\}_{j \in J,\, l \in L,\, m \in M} \tag{14}$$

Set $\mathsf{f}(m, l) = \mathbf{c}_{\mathsf{J}lm}$ where the junk data
$\mathsf{J}$ is a random variable Alice chooses uniformly randomly from
$\{1, \dots, J\}$. The conditional probability $f$ is,

$$f(\mathbf{c}|m, l) = \begin{cases}
1/J, & \text{if } \mathbf{c} \in \{\mathbf{c}_{jlm} : j \in J\} \\
0, & \text{otherwise}
\end{cases} \tag{15}$$

This approach of providing secrecy, illustrated in Example A.2
in the Appendix, will be called transmission of (uniformly
random) junk data because $\mathsf{J}$ is not meant to be a message
although it is encoded like one.^8 Substituting this into the
upper bound of $e_b$ in (13), and similarly for the other fault
probabilities, gives the following expressions.

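Lemma IV.1 (Constant composition code, transmission of
junk data). Using the n-block constant composition code $\theta$ in (14)
of type $Q \in \mathscr{P}_n(\mathcal{X})$ and the transmission of junk data
approach (15), the probabilities in (3) can be upper bounded as follows.

$$e_b \le \sum_{V \in \mathscr{V}_n(Q)} \exp\{-n D(V\|W_b|Q)\} \operatorname{Avg}_{j, l, m} \frac{|\Phi_b^c(m, l) \cap T_V(\mathbf{c}_{jlm})|}{|T_V(\mathbf{c}_{jlm})|} \tag{16a}$$

$$e_e \le \sum_{V \in \mathscr{V}_n(Q)} \exp\{-n D(V\|W_e|Q)\} \operatorname{Avg}_{j, l, m} \frac{|\Phi_e^c(m) \cap T_V(\mathbf{c}_{jlm})|}{|T_V(\mathbf{c}_{jlm})|} \tag{16b}$$

$$s_e \le \sum_{V \in \mathscr{V}_n(Q)} \exp\{-n D(V\|W_e|Q)\} \operatorname{Avg}_{j, l, m} \frac{|\Psi(l) \cap T_V(\mathbf{c}_{jlm})|}{|T_V(\mathbf{c}_{jlm})|} \tag{16c}$$

where $\operatorname{Avg}_{j, l, m}$ is over $j \in J$, $l \in L$ and $m \in M$.

Note that the randomization in the encoder is equivalent to
the averaging over the message augmented with junk data.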
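The transmission of junk data in (15) can be sketched as a stochastic encoder that draws the junk index uniformly and looks up the codeword. The nested-list codebook layout `codebook[j][l][m]`, the function names and the toy codewords are assumptions for illustration, and indices start at 0 rather than 1.

```python
import random

def make_encoder(codebook, num_junk, rng=random):
    """Stochastic encoder: draw junk j uniformly, then send codebook[j][l][m]."""
    def encode(m, l):
        j = rng.randrange(num_junk)   # junk data, uniform on {0, ..., J-1}
        return codebook[j][l][m]
    return encode

def encoder_pmf(codebook, num_junk, m, l):
    """f(c|m,l) of (15): mass 1/J on each c_{jlm}; absent codewords have mass 0."""
    pmf = {}
    for j in range(num_junk):
        c = codebook[j][l][m]
        pmf[c] = pmf.get(c, 0.0) + 1.0 / num_junk
    return pmf

# Toy constant composition codebook (all codewords have two 0s and two 1s),
# with J = 2 junk values, L = 2 private messages and M = 1 public message.
codebook = [[["0011"], ["0101"]],
            [["0110"], ["1001"]]]
encode = make_encoder(codebook, num_junk=2, rng=random.Random(0))
sent = encode(m=0, l=1)
pmf = encoder_pmf(codebook, num_junk=2, m=0, l=1)
print(sent, pmf)
```

Averaging any quantity over `pmf` and then over the messages is the same as averaging over the triple of junk, private and public indices, which is how the averages in (16) arise.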

Another approach of randomization introduced in [2] is
the prefix discrete memoryless channel (prefix DMC), which
is characterized by the conditional probability distribution
$V \in \mathscr{P}(\mathcal{X})^{\tilde{\mathcal{X}}}$ from some finite set
$\tilde{\mathcal{X}}$. The stochastic encoder first maps $(m, l)$ into an
n-sequence in $\tilde{\mathcal{X}}^n$, which is then fed
through the extended prefix DMC $V^n$ before being transmitted
through the channel. To combine this with the transmission of
junk data approach, let $\tilde{f}$ be the original stochastic encoder
defined in (15) except that $\mathcal{X}$ is replaced by $\tilde{\mathcal{X}}$,
and $\theta$ is a constant composition code with type $Q$ on
$\tilde{\mathcal{X}}$. Then, the new encoder is,

$$f(\mathbf{x}|m, l) = \sum_{\mathbf{c} \in \theta} V^n(\mathbf{x}|\mathbf{c})\, \tilde{f}(\mathbf{c}|m, l) \qquad \forall m \in M,\ l \in L,\ \mathbf{x} \in \mathcal{X}^n$$

This is illustrated in Fig. 4(a).

^8 It turns out that $\mathsf{J}$ can also be reliably decoded by Bob with a lower
level of secrecy. Thus, one may choose $\mathsf{J}$ to be meaningful private data
to achieve a new notion of unequal security protection. However, it suffices
for our case of interest to treat $\mathsf{J}$ as meaningless.

Fig. 4. Stochastic encoding with transmission of junk data and prefix DMC:
(a) original model; (b) equivalent model

The prefix DMC can be viewed as part of the wiretap
channel instead of the encoder, as in Fig. 4(b), because the
wiretap channel $(W_b, W_e)$ prefixed with any discrete memoryless
channel $V$ is just another wiretap channel $(VW_b, VW_e)$,
where the product $VW_b$ is the matrix multiplication. Thus,
any performance metric, say $e(W_b, W_e)$, that one obtains
without a prefix DMC can be converted
to the performance metric with a prefix DMC
as $e(VW_b, VW_e)$.
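The statement that a prefixed wiretap channel is again a wiretap channel amounts to a product of row-stochastic matrices. The sketch below illustrates this with assumed binary channels; the matrices are hypothetical and chosen only so that the products are easy to read.

```python
def compose(V, W):
    """Matrix product VW: the channel obtained by feeding V's output into W."""
    return [[sum(V[a][x] * W[x][y] for x in range(len(W)))
             for y in range(len(W[0]))]
            for a in range(len(V))]

# Hypothetical channels, one row per input symbol.
V = [[0.5, 0.5], [0.0, 1.0]]         # prefix DMC
W_b = [[1.0, 0.0], [0.0, 1.0]]       # noiseless channel to Bob
W_e = [[0.75, 0.25], [0.25, 0.75]]   # binary symmetric channel to Eve

VW_b = compose(V, W_b)
VW_e = compose(V, W_e)
print(VW_b, VW_e)
```

Every row of the products still sums to one, so any bound stated for a generic pair of channels can be evaluated at the prefixed pair without further work.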

Because of this simplicity in extending any performance
metric with a prefix DMC, we will leave this prefixing procedure
to the very end and use the encoder defined in (15)
for the main analysis. For a simple comparison between the
prefix DMC and transmission of junk data approaches, readers
can refer to Examples A.2 and A.3 in the Appendix.

C. Random code construction and MMI decoding 

In summary, the encoder $f$ encodes the public and private
messages $m$ and $l$ respectively, and the junk data $\mathsf{J}$, into a
codeword $\mathbf{c}_{\mathsf{J}lm}$ in the constant composition code $\theta$ of
type $Q$. The codeword is then transmitted through the wiretap channel
$(W_b, W_e)$, to which a prefix DMC $V$ will be added in the
end. The fault probabilities simplify to (16), with $(W_b, W_e)$
replaced by $(VW_b, VW_e)$ for the prefix DMC. It remains to
specify how the codebook $\theta$ and decoders $(\phi_b, \phi_e)$ should be
constructed.

Csiszár and Körner [2] consider maximal code construction
with typical set decoding for the wiretap channel. This cannot
be used here since typical set decoding fails to give an exponential
decay rate for the error probabilities. We will adopt the random
code construction scheme with maximum mutual information
(MMI) decoding in [6] instead.

As a preliminary for the random code construction, some
finite set $\mathcal{U}$ is chosen. The wiretap channel is trivially
extended with an additional input symbol from $\mathcal{U}$ to
$(W_b \in \mathscr{P}(\mathcal{Y})^{\mathcal{U} \times \mathcal{X}}, W_e \in \mathscr{P}(\mathcal{Z})^{\mathcal{U} \times \mathcal{X}})$,
where

$$\begin{aligned}
W_b(y|u, x) &:= W_b(y|x) \\
W_e(z|u, x) &:= W_e(z|x)
\end{aligned} \tag{17}$$

for all $(u, x) \in \mathcal{U} \times \mathcal{X}$. In the form of the stochastic
transition function, $\mathsf{W}_b(u, x) := \mathsf{W}_b(x)$ and
$\mathsf{W}_e(u, x) := \mathsf{W}_e(x)$, which
means that the extended channel simply ignores the additional
input symbol. Thus, this trivial extension is purely conceptual
and does not change the original problem.

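As the first step in the random code construction, a type
$Q_0 \in \mathscr{P}_n(\mathcal{U})$ on $\mathcal{U}$ is chosen for the constraint
length $n$. Then, each of the set $\Theta_0 := \{\mathsf{U}_m\}_{m \in M}$ of
n-sequences is uniformly randomly and independently (u.i.) chosen from the
type class $T_{Q_0}$, i.e.

$$\Pr\{\mathsf{U}_m = \mathbf{u}\} = \begin{cases}
1/|T_{Q_0}|, & \mathbf{u} \in T_{Q_0} \\
0, & \text{otherwise}
\end{cases} \qquad \forall m \in M$$

Next, a conditional type $Q_1$ ($\in \mathscr{V}_n(Q_0, \mathcal{X})$) is chosen.
For each $\mathsf{U}_m$ generated, consider its $Q_1$-shell
$T_{Q_1}(\mathsf{U}_m)$. Each of the set
$\Theta_1(m) := \{\mathsf{X}_{jlm}\}_{j \in J,\, l \in L}$ of n-sequences is chosen
u.i. from $T_{Q_1}(\mathsf{U}_m)$, i.e.

$$\Pr\{\mathsf{X}_{jlm} = \mathbf{x} \mid \mathsf{U}_m = \mathbf{u}\} = \begin{cases}
1/|T_{Q_1}(\mathbf{u})|, & \mathbf{x} \in T_{Q_1}(\mathbf{u}) \\
0, & \text{otherwise}
\end{cases}$$

for all $(j, l, m) \in J \times L \times M$, $\mathbf{u} \in T_{Q_0}$.

Finally, $\mathsf{U}_m := \{\mathsf{U}_m^{(i)}\}_{i=1}^n$ and
$\mathsf{X}_{jlm} := \{\mathsf{X}_{jlm}^{(i)}\}_{i=1}^n$ are
combined into one codeword $\mathsf{C}_{jlm} := \mathsf{U}_m \circ \mathsf{X}_{jlm}$,
where $\circ$ denotes the element-wise concatenation, i.e.

$$\mathsf{C}_{jlm} := \mathsf{U}_m \circ \mathsf{X}_{jlm} = \{(\mathsf{U}_m^{(i)}, \mathsf{X}_{jlm}^{(i)})\}_{i=1}^n \tag{18}$$

The i-th term $\mathsf{C}_{jlm}^{(i)} := (\mathsf{U}_m^{(i)}, \mathsf{X}_{jlm}^{(i)})$
is transmitted in the i-th use of the (extended) wiretap channel. The random
code $\Theta$ is defined as the ordered structure
$\{\mathsf{C}_{jlm}\}_{j \in J,\, l \in L,\, m \in M}$.
Its type is denoted as $Q \in \mathscr{P}_n(\mathcal{U} \times \mathcal{X})$ where
$Q(u, x) := Q_0(u) Q_1(x|u)$ ($(u, x) \in \mathcal{U} \times \mathcal{X}$).
We write $Q = Q_0 \circ Q_1$ where $\circ$ denotes the direct product.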

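Definition IV.1 (Random code). The random code $\Theta$ of type
$Q := Q_0 \circ Q_1$ ($Q_0 \in \mathscr{P}_n(\mathcal{U})$,
$Q_1 \in \mathscr{V}_n(Q_0, \mathcal{X})$) for the
extended wiretap channel (17) is defined as follows.

$$\begin{aligned}
\Theta &:= \{\mathsf{C}_{jlm}\}_{jlm}, \qquad \mathsf{C}_{jlm} := \mathsf{U}_m \circ \mathsf{X}_{jlm} \\
\Theta_0 &:= \{\mathsf{U}_m\}_m \quad \text{chosen u.i. from } T_{Q_0} \\
\Theta_1(m) &:= \{\mathsf{X}_{jlm}\}_{jl} \quad \text{chosen u.i. from } T_{Q_1}(\mathsf{U}_m)
\end{aligned}$$

In words, it is the set of codewords $\mathsf{C}_{jlm}$ indexed by the
messages $j \in J$, $l \in L$ and $m \in M$. Each codeword consists
of an n-sequence $\mathsf{U}_m$ that belongs to the random codebook $\Theta_0$,
and an n-sequence $\mathsf{X}_{jlm}$ that belongs to the random codebook
$\Theta_1(m)$. The codewords from $\Theta_0$ are selected u.i. from the
type class $T_{Q_0}$ and the codewords from $\Theta_1(m)$ are selected
u.i. from the $Q_1$-shell $T_{Q_1}(\mathsf{U}_m)$ of $\mathsf{U}_m$.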
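The two-layer random construction can be sketched in a few lines of Python. This is a minimal sketch under assumed conventions: types are given as symbol counts rather than probabilities, indices start at 0, and the function names are hypothetical. Shuffling a multiset of symbols gives a uniform draw from the corresponding type class because every arrangement arises from the same number of permutations.

```python
import random

def random_from_type_class(counts, rng):
    """Uniform draw from the type class with symbol counts {symbol: count}."""
    seq = [s for s, k in counts.items() for _ in range(k)]
    rng.shuffle(seq)
    return tuple(seq)

def random_from_shell(u, shell_counts, rng):
    """Uniform draw from the Q1-shell of u.

    shell_counts[a] gives {x: count} for the positions where u equals a;
    the counts for each a must add up to the number of occurrences of a in u.
    """
    x = [None] * len(u)
    for a, counts in shell_counts.items():
        positions = [i for i, s in enumerate(u) if s == a]
        for i, s in zip(positions, random_from_type_class(counts, rng)):
            x[i] = s
    return tuple(x)

def random_code(q0_counts, shell_counts, M, L, J, rng):
    """Codewords code[j, l, m] = u_m o x_jlm, stored as tuples of (u, x) pairs."""
    code = {}
    for m in range(M):
        u = random_from_type_class(q0_counts, rng)   # cloud center for m
        for l in range(L):
            for j in range(J):
                x = random_from_shell(u, shell_counts, rng)
                code[j, l, m] = tuple(zip(u, x))
    return code

rng = random.Random(1)
code = random_code({"a": 2, "b": 2}, {"a": {0: 1, 1: 1}, "b": {0: 2}},
                   M=2, L=2, J=2, rng=rng)
print(code[0, 0, 0])
```

Every codeword produced this way has the same joint type, so the resulting code is a constant composition code on the extended input alphabet.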

This approach of random code construction is well known
in the asymmetric broadcasting channel setting. $\Theta_0$ is used to
partition $\mathcal{X}^n$ into cells/clouds $\{T_{Q_1}(\mathsf{U}_m)\}_m$ that
are intended to be well distinguishable through the channels of both Bob
and Eve, and $\Theta_1(m)$ is the set of codewords selected from
the containing cell that are intended to be well distinguishable
by Bob but not necessarily so by Eve. The addition of an input
symbol from $\mathcal{U}$ gives an additional degree of freedom in
optimizing the average performance of the code.

It is important to note that, unlike the randomness in
the stochastic encoding, the randomness in the codebook is
known to all parties (Alice, Bob and Eve). The randomization
happens in the code design phase before the public and private
messages are generated for the transmission phase.

With the structure of the codebook defined, we can now
complete the specification of the coding scheme with the
maximum mutual information (MMI) decoder for Bob and Eve.
Consider a particular realization $\theta$ of the random code $\Theta$. Let
$I(Q, V)$ denote the mutual information,

$$I(Q, V) := H(QV) - H(V|Q) \qquad \text{see (11)} \tag{19}$$

Then, $I(\mathbf{x} \wedge \mathbf{y})$, referred to as the empirical mutual
information between $\mathbf{x}$ and $\mathbf{y}$, is defined as,

$$I(\mathbf{x} \wedge \mathbf{y}) := I(P_{\mathbf{x}}, P_{\mathbf{y}|\mathbf{x}}) \qquad \text{see (19), (6), (8)} \tag{20}$$

Suppose Bob observes $\mathbf{y} \in \mathcal{Y}^n$ through his channel. He
searches for the codeword $\mathbf{c} \in \theta$ that maximizes the empirical
mutual information $I(\mathbf{c} \wedge \mathbf{y})$.^9 If there is a unique
$\mathbf{c}_{jlm}$ that achieves the maximum, he declares $m$ as the public
message and $l$ as the private message. More precisely,

$$\phi_b(\mathbf{y}) = (m, l) \iff \exists! (m, l, j),\ I(\mathbf{c}_{jlm} \wedge \mathbf{y}) = \max_{\mathbf{c} \in \theta} I(\mathbf{c} \wedge \mathbf{y}) \tag{21}$$

Similarly, suppose Eve receives $\mathbf{z}$. She searches for the unique
$\mathbf{u}_m$ that achieves the maximum
$\max_{\mathbf{u} \in \theta_0} I(\mathbf{u} \wedge \mathbf{z})$,^10 i.e.

$$\phi_e(\mathbf{z}) = m \iff \exists! m,\ I(\mathbf{u}_m \wedge \mathbf{z}) = \max_{\mathbf{u} \in \theta_0} I(\mathbf{u} \wedge \mathbf{z}) \tag{22}$$

The encoder and decoders are functions of the codebook $\theta$,
i.e. $f[\theta](\mathbf{c}|m, l)$, $\phi_b[\theta](\mathbf{y})$ and
$\phi_e[\theta](\mathbf{z})$ etc. However, for
notational simplicity, the dependence on $\theta$ will be omitted.

Using the random coding scheme, we can further bound the 
fault probabilities (16) with the expected fault probabihties 
over the random code ensemble as follows. From (16a), the 
expectation of Cf, over the random code 8 is, 

E(e6(8))< J2 exp{-nD{V\\Wb\Q)}x 

/3(V,e,<E.J): = 



We will not need to assume any structure for other than 
the fact it has to be a deterministic list decoder with fixed list 
size A. ' ' The coding scheme without prefix DMC can now be 
summarized as follows. 

Definition IV.2 (Coding scheme). The coding scheme without 
prefix DMC for a realization 9 of the random code in Defini- 
tion IV. I is defined as follows. 

Encoding: Alice generates the junk data J uniformly randomly 
from {1, . . . , J} and encodes the common message m £ M 
and secret I G L into {Um,xjim) G 9. She only transmits 
Xjim through the channel. The encoding function is therefore, 

f{x\m,l) := i ^ ^ ■7''"J'jeJ ^ qj. equivalently 

I , otherwise 

f(m, I) := Xjini G 9i{m) , Vm £ M, I £ L 

Decoding: If Bob receives y, he finds a codeword c <E 9 that 
maximizes the empirical mutual information /(cAy) and use 

'Note that the optimal decoding rule is the maximum likehhood decoding 
instead. MMI decoding is adopted here for simplicity. 

'"One may think that Eve can search for the unique Cji^n that achieves the 
maximum maxcge I{c/\ z), and declare m as the public message. Because 
of the suboptimality of the MMI decoding and the random code construction, 
this choice turns out to be unfavorable. 

"it is clear, however, that the optimal 1/1 is an extension of the maximum 
likelihood decoding rule with A estimates instead of one. 



X E Avg 

\jeJ,ieL,meM 



MimJ)nTv{Cji^)\ 



\Tv{C 



'jlm ) 



<|r„(g)| max Clip {-nD{V\\Wb\Q)} Pie, <^>l) 
vev„(Q) 

< (n + l)!-^!!^! max exp{~nD(V\\Wb\Q)} P(e,<i>'b) 
ver„(Q) 

where the last inequaUty is due to the Type Counting Lemma 
\'^n{Q)\ < (n + The expectation of Ce and Se can 

be upper bounded similarly. By the union bound, 

Pr{ef,(8) > 3E(ef,(8)) or < Pr{ef,(8) > 3E(eb(8))} 
ee(8) > 3 E(ee(8)) or + Pr{e,(8) > 3 E(ee(8))} 

s,(8) > 3E(se(8))} + Pr{se(8) > 3E(se(e))} 

which is < 1 due to the Markov inequality Pr(A > a E(A)) < 
1/a for non-negative random variable A and a > 0. Thus, 
the complement of the event has positive probability, which 
implies existence of a realization 9 of Q such that the fault 
probabilities can be bounded simultaneously as follows, 

(23a) ebi9) < 3{n + 8, $g) 

(23b) ee(0) <3(7i + l)l-*ll^ls(W^e,8,$^) 
(23c) se{9) < 3(n+l)l'*ll^ls(M^e,8,*) 

where s is defined as follows. 



(24a) (3iV, 8, $) := E 



Avg 



1W\Q)}P{ 
^'(/) are the trivial 



(24b) s(W^, 8, $) := max eiip {-nD(y\\W\Q)} P{V,e,<S>) 



and ^e{m,l) :— ^e{m), \['(m, Z) 
extensions for all {m, I) G M x L. 
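
The MMI decoding rule of (21) and (22) can be made concrete with a short Python sketch. It is a minimal illustration, not the paper's implementation: the function names and the dictionary representation of the codebook (index to codeword) are assumptions, and ties are reported as `None` to mirror the uniqueness requirement.

```python
import math
from collections import Counter


def empirical_mutual_information(x, y):
    """Mutual information (in bits) of the joint type of the sequences x and y."""
    n = len(x)
    pxy, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    return sum(c / n * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def mmi_decode(codebook, y):
    """Return the index of the unique codeword maximizing I(c ^ y), or None on a tie.

    codebook maps an index, e.g. (m, l, j) for Bob or m for Eve, to a codeword.
    """
    scores = {idx: empirical_mutual_information(c, y) for idx, c in codebook.items()}
    best = max(scores.values())
    winners = [idx for idx, s in scores.items() if math.isclose(s, best, abs_tol=1e-12)]
    return winners[0] if len(winners) == 1 else None
```

Note that the empirical mutual information is invariant to relabeling of symbols, so a codeword and its bitwise complement tie for a binary alphabet; this is one concrete sense in which MMI decoding is suboptimal compared with maximum likelihood decoding.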

To compute the desired exponents, we consider a sequence 
of random codes defined as follows. 

^12 This follows from the definition (8) that there are at most n + 1 possible
values for each entry of a canonical conditional type (see the Type Counting
Lemma 2.2 of [3]).
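Definition IV.3 (Sequence of random codes). $\{\Theta^{(n)}\}$, or
simply $\Theta$, denotes a sequence of random codes $\Theta^{(n)}$ (see
Definition IV.1) of type $Q^{(n)} = Q_0^{(n)} \circ Q_1^{(n)}$ ($Q_0^{(n)} \in \mathcal{P}_n(\mathcal{U})$,
$Q_1^{(n)} \in \mathcal{V}_n(Q_0^{(n)}, \mathcal{X})$) that converges to a distribution
$Q = Q_0 \circ Q_1$ ($Q_0 \in \mathcal{P}(\mathcal{U})$, $Q_1 \in \mathcal{P}(\mathcal{X})^{\mathcal{U}}$) in variation
distance, i.e. $\delta_{var}(Q^{(n)}, Q) \to 0$, where

(25)  $\delta_{var}(P, Q) := \max_{A \subset \mathcal{X}} |P(A) - Q(A)|$

Furthermore, $J^{(n)}$ grows exponentially at the junk data rate

(26)  $\lim_{n \to \infty} \frac{1}{n} \log J^{(n)} = R_J \ge 0$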

If one can find $\gamma_b(V, Q)$ continuous in $Q \circ V$ in variation
distance such that

$\liminf_{n \to \infty} -\frac{1}{n} \log \beta(V^{(n)}, \Theta^{(n)}, (\Phi_b^{(n)})^c) \ge \gamma_b(V, Q)$

for any $Q^{(n)} \circ V^{(n)}$ converging to $Q \circ V$ in variation distance,
then Bob's error exponent (5a) can be lower bounded as

$E_b(\theta) \ge \liminf_{n \to \infty} -\frac{1}{n} \log s(W_b, \Theta^{(n)}, (\Phi_b^{(n)})^c) \ge \min_V D(V\|W_b|Q) + \gamma_b(V, Q)$

and similarly for the other exponents $E_e$ and $S_e$ in (5).

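Lemma IV.2. If $\gamma_b(V, Q)$, $\gamma_e(V, Q)$ and $\gamma(V, Q)$ are continuous
in the joint distribution $Q \circ V$ (with respect to the variation
distance (25)) and lower bound the exponent

$\liminf_{n \to \infty} -\frac{1}{n} \log \beta(V, \Theta, \Phi)$

for the random code $\Theta$ and the cases $\Phi$ equal to $\Phi_b^c$, $\Phi_e^c$ and $\Psi'$
respectively, then there exists a realization $\theta$ of $\Theta$ such that

(27a)  $E_b(\theta) \ge \min_V D(V\|W_b|Q) + \gamma_b(V, Q)$
(27b)  $E_e(\theta) \ge \min_V D(V\|W_e|Q) + \gamma_e(V, Q)$
(27c)  $S_e(\theta) \ge \min_V D(V\|W_e|Q) + \gamma(V, Q)$

In the sequel, we will compute $\gamma_b$, $\gamma_e$ and $\gamma$ to obtain the
desired lower bounds of the exponents.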
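
Bounds of the form "minimum over V of a divergence term plus an exponent term" can be evaluated numerically for small alphabets. The sketch below does this by grid search for a binary-input, binary-output channel. It is an illustration only: the function names are hypothetical, and the exponent term $|I(Q, V) - R|^+$ is a stand-in chosen for the example (it has the same shape as the bounds derived later), not one of the paper's three exponents.

```python
import math


def cond_divergence(V, W, Q):
    """Conditional divergence D(V || W | Q) in bits; W must have no zero entries."""
    return sum(q * v * math.log2(v / w)
               for q, v_row, w_row in zip(Q, V, W)
               for v, w in zip(v_row, w_row) if v > 0)


def mutual_information(Q, V):
    """I(Q, V) = H(QV) - H(V|Q) in bits."""
    out = [sum(q * row[y] for q, row in zip(Q, V)) for y in range(len(V[0]))]
    return sum(q * v * math.log2(v / out[y])
               for q, row in zip(Q, V) for y, v in enumerate(row) if v > 0)


def exponent_lower_bound(Q, W, rate, grid=100):
    """Grid-search approximation of min_V D(V||W|Q) + |I(Q,V) - rate|^+ (binary case)."""
    best = math.inf
    for a in range(grid + 1):
        for b in range(grid + 1):
            V = [[1 - a / grid, a / grid], [b / grid, 1 - b / grid]]
            value = cond_divergence(V, W, Q) + max(0.0, mutual_information(Q, V) - rate)
            best = min(best, value)
    return best
```

Because the search is restricted to a grid, the returned value can only overestimate the true minimum; it is zero whenever the grid contains V = W and the rate is at least I(Q, W).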

V. Success exponent 

From Lemma IV.2, to obtain a lower bound of the achievable^13
success exponent $S_e$ (5c), it suffices to compute a lower
bound $\gamma(V)$ on the exponent of the expected average fraction
$\beta(V, \Theta, \Psi')$ for any $\Psi$ satisfying the guessing rate (4c).

Consider first some realization $\theta$ of the random code in
Definition IV.1.

$\beta(V, \theta, \Psi') = \mathrm{Avg}_{j \in J,\, l \in L,\, m \in M} \frac{|\Psi(l) \cap T_V(c_{jlm})|}{|T_V(c_{jlm})|}$        by (24a)
$= \frac{1}{J\,|T_V(c_{111})|}\, \mathrm{Avg}_{l, m} \sum_{j} |\Psi(l) \cap T_V(c_{jlm})|$

since $|T_V(c_{jlm})|$ depends on $c_{jlm}$ only through its type $Q$ (and
$n$). The fraction can be made small if $\sum_j |\Psi(l) \cap T_V(c_{jlm})|$
on the R.H.S. is made small for each $l$ and $m$. Imagine $\Psi(l)$
as a net that Eve uses to cover the shells $\{T_V(c_{jlm}) : j \in J\}$
owned by Alice as much as possible. Roughly speaking, since
the net cannot be too large due to the list size constraint, Alice
should spread out the shells as much as possible to minimize
her loss. We will refer to this heuristically desired property of
$\theta$, that the V-shells $\{T_V(c_{jlm}) : j \in J\}$ spread out for every
$V$, $m$ and $l$, as the overlap property.^14 This is illustrated in
Fig. 5, in which the configuration on the left has $\sum_j |\Psi(1) \cap T_V(c_{j11})|$
three times larger than the one on the right.

Fig. 5. Effectiveness of stochastic encoding (left: not well spread; right: well spread)

Intuitively, a random code has the overlap property on average
since it uniformly spaces out the codewords. This is made
precise with the following Overlap Lemma.

^13 Achievable here does not refer to achievable by Eve, but achievable by
Alice as defined in Definition III.3.

Lemma V.1 (Overlap). Let $X_j$ ($j = 1, \dots, J$) be an n-sequence
uniformly and independently drawn from $T_Q \subset \mathcal{X}^n$.
For all $J \in \mathbb{Z}^+$, $\delta > 0$, $n \ge n_0(\delta, |\mathcal{X}||\mathcal{Z}|)$, $z \in \mathcal{Z}^n$,
$Q \in \mathcal{P}_n(\mathcal{X})$, $V \in \mathcal{V}_n(Q, \mathcal{Z})$ such that $\lfloor \exp\{nI(Q, V)\} \rfloor \ge J$, we
have

$\Pr\{ \sum_{j \in J} 1\{z \in T_V(X_j)\} \ge \exp(n\delta) \} \le \exp(-\exp(n\delta))$

where $1$ is the indicator function and $n_0$ is some integer-valued
function that depends only on $\delta$ and $|\mathcal{X}||\mathcal{Z}|$.

In words, the lemma states that the chance of having
exponentially ($\exp(n\delta)$) many shells (from $\{T_V(X_j) : j \in J\}$)
overlapping at a spot ($z$) is doubly exponentially decaying
($\exp(-\exp(n\delta))$), provided that the shells are not enough to
fill the entire space ($T_{QV} \subset \mathcal{Z}^n$) in which they can possibly reside
(i.e. $J \le \lfloor \exp\{nI(Q, V)\} \rfloor$). For the case of interest, we will
prove the following more general form of the lemma with
conditioning.
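
The quantity controlled by the Overlap Lemma, the number of V-shells that contain a fixed observation z, can be explored numerically at small block lengths. The following Python sketch (binary alphabets only; all function names and the representation of the joint type by a `Counter` are assumptions made for illustration) computes the exact probability that one random shell covers z and counts overlaps by Monte Carlo.

```python
import random
from collections import Counter
from math import comb


def in_shell(x, z, joint_counts):
    """True if z lies in the V-shell of x, i.e. (x, z) has the prescribed joint type."""
    return Counter(zip(x, z)) == joint_counts


def hit_probability(joint_counts):
    """Exact Pr{z in T_V(X)} for X uniform on its type class (binary alphabets)."""
    n00, n01 = joint_counts[(0, 0)], joint_counts[(0, 1)]
    n10, n11 = joint_counts[(1, 0)], joint_counts[(1, 1)]
    n = n00 + n01 + n10 + n11
    # choose where x = 0 inside the block of positions with z = 0 and with z = 1
    favourable = comb(n00 + n10, n00) * comb(n01 + n11, n01)
    return favourable / comb(n, n00 + n01)


def count_overlaps(z, joint_counts, J, rng):
    """Draw J codewords uniformly from the type class and count the shells containing z."""
    zeros = joint_counts[(0, 0)] + joint_counts[(0, 1)]
    base = [0] * zeros + [1] * (len(z) - zeros)
    hits = 0
    for _ in range(J):
        x = base[:]
        rng.shuffle(x)
        hits += in_shell(x, z, joint_counts)
    return hits
```

The expected number of overlaps is J times the hit probability; the lemma strengthens this first-moment picture to a doubly exponential tail bound once J does not exceed $\exp\{nI(Q, V)\}$.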

Lemma V.2 (Overlap (with conditioning)). Let $Q := Q_0 \circ Q_1$
($Q_0 \in \mathcal{P}_n(\mathcal{U})$, $Q_1 \in \mathcal{V}_n(Q_0, \mathcal{X})$) be a joint type, $U$ be a
random variable distributed over $T_{Q_0}$, and $X_j$ ($j = 1, \dots, J$)
be an n-sequence uniformly and independently drawn from
$T_{Q_1}(U) \subset \mathcal{X}^n$. For all $J \in \mathbb{Z}^+$, $\delta > 0$, $n \ge n_0(\delta, |\mathcal{U}||\mathcal{X}|)$,
$z \in \mathcal{Z}^n$, $V \in \mathcal{V}_n(Q, \mathcal{Z})$ such that
$\lfloor \exp\{nI(Q_1, V|Q_0)\} \rfloor \ge J$, we have

(28)  $\Pr\{ \sum_{j \in J} 1\{z \in T_V(U \circ X_j)\} \ge \exp(n\delta) \} \le \exp\{-\exp(n\delta)\}$

where $\circ$ denotes element-wise concatenation (18), and

(29)  $I(Q_1, V|Q_0) := H(Q_1 V|Q_0) - H(V|Q_0 \circ Q_1) = H(Q_1 V|Q_0) - H(V|Q)$

denotes the conditional mutual information (cf. (19)).

^14 Though not explicitly stated, this notion of overlap property is also evident
in [2] for the typical case when $V$ is close to $W_e$ (see Lemma 2 of [2]).
For the purpose of computing the exponent, we extend it to the atypical case
of $V$ and relax the extent that the shells have to spread out by allowing
a subexponential amount of overlap.

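Proof: For notational simplicity, consider the case when
$\exp(n\delta)$ and $\exp\{nI(Q_1, V|Q_0)\}$ are integers.^15 Consider
some subset $J'$ of $\{1, \dots, J\}$ with $|J'| = \exp(n\delta)$. Since
the events $z \in T_V(U \circ X_j)$ ($j = 1, \dots, J$) are conditionally
mutually independent given $U = u \in T_{Q_0}$,

$\Pr\{ z \in \bigcap_{j \in J'} T_V(U \circ X_j) \} = \sum_{u} P_U(u) \Pr\{z \in T_V(u \circ X_j)\}^{\exp(n\delta)}$
$\le \exp\{ -n[I(Q_1, V|Q_0) - \delta/2] \exp(n\delta) \}$

for $n \ge n_0(\delta, |\mathcal{U}||\mathcal{X}|)$, where the last inequality is by
Lemma A.1 using the uniform distribution of $X_j$ and
Lemma 1.2.5 of [3] on the cardinality bounds of conditional
type classes. Since $\exp\{nI(Q_1, V|Q_0)\} \ge J$, the number of
distinct choices of $J'$ is

$\binom{J}{\exp(n\delta)} \le \binom{\exp(nI(Q_1, V|Q_0))}{\exp(n\delta)} \le \exp\{ [\log e + n(I(Q_1, V|Q_0) - \delta)] \exp(n\delta) \}$

where the last inequality is by Lemma A.2. By the union
bound, the L.H.S. of (28) is upper bounded by the product of
the last two expressions, i.e.

$\Pr\{ \sum_{j \in J} 1\{z \in T_V(U \circ X_j)\} \ge \exp(n\delta) \} \le \binom{J}{\exp(n\delta)} \Pr\{ z \in \bigcap_{j \in J'} T_V(U \circ X_j) \}$

Substituting the previously derived bounds for each term
gives the desired upper bound $\exp(-\exp(n\delta))$ when $n \ge n_0(\delta, |\mathcal{U}||\mathcal{X}|)$. ∎

^15 The case when $\exp(n\delta)$ and $\exp\{nI(Q_1, V|Q_0)\}$ are not integers can be derived
by taking their ceilings or floors and grouping the fractional increments into
some dominating terms.

Consider now a sequence of random codes $\Theta^{(n)}$ defined
in Definition IV.3. The desired bound on the exponent of
$\beta(V, \Theta, \Psi')$ can be computed as follows using the Overlap
Lemma.

Lemma V.3 (Success exponent). Consider the random code
sequence $\Theta$ defined in Definition IV.3. For any sequence of list
decoding attacks $\psi$ satisfying the guessing rate $R_\lambda$ (4c),

$\liminf_{n \to \infty} -\frac{1}{n} \log \beta(V, \Theta, \Psi') \ge \left| R_L - R_\lambda + |R_J - I(Q_1, V|Q_0)|^- \right|^+$

where $|a|^+ := \max\{0, a\}$ and $|a|^- := \min\{0, a\}$.

Proof: By the Overlap Lemma V.2, for any $\delta > 0$ and
$n \ge n_0(\delta)$,

$\Pr\{ \sum_{j \in J_k(V)} 1\{z \in T_V(C_{jlm})\} \ge \exp(n\delta) \mid \Theta_0 = \theta_0 \} \le \exp\{-\exp(n\delta)\}$

where $\Theta_0$ is the codebook $\{U_m\}_{m \in M}$, $\theta_0$ is an arbitrary
realization, and $\{J_k(V)\}_{k \in K_V}$ is a partitioning of $\{1, \dots, J\}$
defined as

$J_k(V) := \{(k-1)J_V + 1, \dots, \min\{kJ_V, J\}\}$
$J_V := \lfloor \exp\{nI(Q_1, V|Q_0)\} \rfloor$
$K_V := \lceil J / J_V \rceil$

The expectation of the sum of indicators on the left can then
be bounded as follows,

$E[ \sum_{j \in J_k(V)} 1\{z \in T_V(C_{jlm})\} \mid \Theta_0 = \theta_0 ] \le \exp(n\delta) \cdot 1 + J \cdot \exp\{-\exp(n\delta)\} \le \exp(n2\delta)$

where the last inequality is true for $n \ge n_0(\delta, R_J, |\mathcal{U}||\mathcal{X}|)$ by
(26). Since $T_V(C_{jlm})$ is contained by $T_{Q_1 V}(u_m)$,

$\sum_{z \in \Psi(l)} E[ \sum_{j \in J_k(V)} 1\{z \in T_V(C_{jlm})\} \mid \Theta_0 = \theta_0 ] \le \exp(n2\delta)\, |\Psi(l) \cap T_{Q_1 V}(u_m)|$

By linearity of expectation,

$E[ \sum_{j \in J_k(V)} |\Psi(l) \cap T_V(C_{jlm})| \mid \Theta_0 = \theta_0 ] \le \exp(n2\delta)\, |\Psi(l) \cap T_{Q_1 V}(u_m)|$

Summing both sides over $k \in K_V$,

$E[ \sum_{j \in J} |\Psi(l) \cap T_V(C_{jlm})| \mid \Theta_0 = \theta_0 ] \le \exp(n2\delta)\, K_V\, |\Psi(l) \cap T_{Q_1 V}(u_m)|$

Summing both sides over $l \in L$ and applying the list size
constraint on $\Psi$ in Lemma A.3 to the R.H.S.,

$E[ \sum_{j \in J,\, l \in L} |\Psi(l) \cap T_V(C_{jlm})| \mid \Theta_0 = \theta_0 ] \le \exp(n2\delta)\, K_V\, \lambda\, |T_{Q_1 V}(u_m)|$

Averaging both sides over $m \in M$, dividing by the constant
$JL\,|T_V(C_{jlm})|$ and taking the expectation over all possible
realizations of $\Theta_0$ gives

$\beta(V, \Theta, \Psi') \le \exp(n2\delta)\, \frac{K_V\, \lambda\, |T_{Q_1 V}(u_1)|}{JL\, |T_V(c_{111})|}$

To compute the desired exponent from the last inequality,
denote the inequality in the exponent $\dot{\le}$ as follows,

(30)  $a_n \mathrel{\dot{\le}} b_n \iff \limsup_{n \to \infty} \frac{1}{n} \log a_n \le \liminf_{n \to \infty} \frac{1}{n} \log b_n$

Then, $K_V \mathrel{\dot{\le}} \exp\{n|R_J - I(Q_1, V|Q_0)|^+\}$, $J \doteq \exp\{nR_J\}$
(equality in the exponent) by (26), $L \mathrel{\dot{\ge}} \exp\{nR_L\}$ by (4b), $\lambda \mathrel{\dot{\le}} \exp\{nR_\lambda\}$ by
(4c), and $|T_{Q_1 V}(u_1)| / |T_V(c_{111})|$ is $\dot{\le} \exp\{nI(Q_1, V|Q_0)\}$.
Combining these, $\beta(V, \Theta, \Psi')$ is $\dot{\le}$ the following expression,

$\exp\{ -n[ R_L - R_\lambda + [R_J - I(Q_1, V|Q_0)] - |R_J - I(Q_1, V|Q_0)|^+ ] \}$

To obtain the desired bound, simplify this with the identity
$|a|^- = a - |a|^+$, and the fact that $\beta(V, \Theta, \Psi') \le 1$. ∎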
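
The bound of Lemma V.3 is simple enough to evaluate directly. The Python sketch below is a hypothetical helper (the function and argument names are not from the paper) that computes $|R_L - R_\lambda + |R_J - I|^-|^+$ and makes the role of the junk rate visible: once $R_J$ reaches the conditional mutual information $I(Q_1, V|Q_0)$, additional junk data no longer improves the bound.

```python
def pos(a):
    """|a|^+ = max{0, a}."""
    return max(0.0, a)


def neg(a):
    """|a|^- = min{0, a}."""
    return min(0.0, a)


def success_exponent_bound(R_L, R_lam, R_J, I_cond):
    """Lower bound of Lemma V.3 on the exponent of the covered fraction.

    I_cond stands for the conditional mutual information I(Q1, V | Q0).
    """
    return pos(R_L - R_lam + neg(R_J - I_cond))
```

For example, with $R_L = 0.5$, $R_\lambda = 0.2$ and $I = 0.3$, the bound is 0.1 at $R_J = 0.1$ and saturates at 0.3 for any $R_J \ge 0.3$.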

VI. Error exponents 

The desired error exponents can be obtained directly from
the achievability result in [6] by grouping $(j, l) \in J \times L$
as one private message for Bob. This is because the
exponent of the probability that Bob decodes the private message wrongly lower
bounds the exponent of the probability that Bob decodes the secret wrongly.^16 For
completeness, we provide a similar derivation in this section.
Readers familiar with [6] may skip to the next section.

In view of Lemma IV.2, the error exponents for Bob and
Eve can be obtained by lower bounding the exponents of the
fractions $\beta(V, \Theta, \Phi_b^c)$ and $\beta(V, \Theta, \Phi_e^c)$ respectively. Thus, the
objective is to prove the following lemma.

Lemma VI.1 (Error exponents). Consider the sequence of
random codes $\Theta$ in Definition IV.3, and the MMI decoders
(decision region maps) $\phi_b$ ($\Phi_b$) (21) and $\phi_e$ ($\Phi_e$) (22) for Bob
and Eve respectively. Then,

$\liminf_{n \to \infty} -\frac{1}{n} \log \beta(V, \Theta, \Phi_b^c) \ge \left| I(Q_1, V|Q_0) - R_J - R_L + |I(Q_0, Q_1 V) - R_M|^- \right|^+$

$\liminf_{n \to \infty} -\frac{1}{n} \log \beta(V, \Theta, \Phi_e^c) \ge |I(Q_0, Q_1 V) - R_M|^+$

A. Exponent for Bob 

In view of Lemma IV.2, the error exponent for Bob can
be obtained by lower bounding the exponent of the fraction

$\beta(V, \Theta, \Phi_b^c) = E\, \mathrm{Avg}_{j \in J,\, l \in L,\, m \in M} \frac{|\Phi_b^c(m, l) \cap T_V(C_{jlm})|}{|T_V(C_{jlm})|}$

where $\Theta$ is the sequence of random codes in Definition IV.3
and $\Phi_b$ is the decision region of the MMI decoder $\phi_b$ in (21).
$\Phi_b^c(m, l) \cap T_V(C_{jlm})$ is the set of bad observations in the V-shell
of $C_{jlm}$ that lead to error if $C_{jlm}$ is transmitted. With
the MMI decoder (21), this corresponds to the set of
$y \in T_V(C_{jlm})$ that has $I(C_{jlm} \wedge y)$ no larger than $I(C_{j'l'm'} \wedge y)$
for some misleading codeword $C_{j'l'm'}$ where $j' \in J$ and
$(l', m') \in L \times M \setminus \{(l, m)\}$, i.e.

$\Phi_b^c(m, l) \cap T_V(C_{jlm}) = \{ y \in T_V(C_{jlm}) \cap T_{V'}(C_{j'l'm'}) : (j', l', m') \in \mathcal{W}_b^{(1)}(m) \cup \mathcal{W}_b^{(2)}(m, l),\ V' \in \mathcal{V}_b(V) \}$

$\mathcal{V}_b(V) := \{ V' \in \mathcal{V}_n(Q) : I(Q, V') \ge I(Q, V) \}$
$\mathcal{W}_b^{(1)}(m) := \{ (j', l', m') : j' \in J,\ l' \in L,\ m' \in M \setminus \{m\} \}$
$\mathcal{W}_b^{(2)}(m, l) := \{ (j', l', m) : j' \in J,\ l' \in L \setminus \{l\} \}$

(The dependence on $V$, $m$ and $l$ will be omitted if there is no
ambiguity.) $\mathcal{V}_b$ is the set of problematic conditional types that
can lead to error. $(\mathcal{W}_b^{(1)}, \mathcal{W}_b^{(2)})$ forms a partition of the set
of indices for the misleading codewords. In particular, $\mathcal{W}_b^{(1)}$
corresponds to the indices of misleading codewords that result
in decoding the public message wrongly if the observation lies in
a problematic V'-shell of the misleading codeword. Similarly,
$\mathcal{W}_b^{(2)}$ corresponds to the indices of misleading codewords that
result in decoding the private message wrongly but decoding the
public message correctly.^17 By the union bound,

$|\Phi_b^c(m, l) \cap T_V(C_{jlm})| \le \sum_{V' \in \mathcal{V}_b} \sum_{(j', l', m') \in \mathcal{W}_b^{(1)}} |T_V(C_{jlm}) \cap T_{V'}(C_{j'l'm'})| + \sum_{V' \in \mathcal{V}_b} \sum_{(j', l', m') \in \mathcal{W}_b^{(2)}} |T_V(C_{jlm}) \cap T_{V'}(C_{j'l'm'})|$

Consider the second summation, where $U_{m'} = U_m$ because
$m' = m$. Since $T_V(C_{jlm}) \cap T_{V'}(C_{j'l'm'})$ is contained by

$T_{Q_1 V}(U_m) \cap T_{Q_1 V'}(U_m) \subset T_{QV} \cap T_{QV'}$

the summand is zero if $QV \ne QV'$ or $Q_1 V \ne Q_1 V'$ by
the uniqueness of (canonical conditional) types. Since the
premise $Q_1 V = Q_1 V'$ implies $I(Q_0, Q_1 V) = I(Q_0, Q_1 V')$, we can impose
this constraint (temporarily) in the second summation without
affecting the sum. Under this equality constraint, however,
the inequality constraint $I(Q, V') \ge I(Q, V)$ on $V'$ can be
replaced by $I(Q_1, V'|Q_0) \ge I(Q_1, V|Q_0)$. Withdrawing the
equality constraint gives the following upper bound,

$|\Phi_b^c(m, l) \cap T_V(C_{jlm})| \le \sum_{V' \in \mathcal{V}_n(Q):\, I(Q, V') \ge I(Q, V)}\ \sum_{(j', l', m') \in \mathcal{W}_b^{(1)}} |T_V(C_{jlm}) \cap T_{V'}(C_{j'l'm'})|$
$+ \sum_{V' \in \mathcal{V}_n(Q):\, I(Q_1, V'|Q_0) \ge I(Q_1, V|Q_0)}\ \sum_{(j', l', m') \in \mathcal{W}_b^{(2)}} |T_V(C_{jlm}) \cap T_{V'}(C_{j'l'm'})|$

To bound the expectation on the left, it suffices to bound
the expectation of $|T_V(C_{jlm}) \cap T_{V'}(C_{j'l'm'})|$ on the right by
the Packing Lemma [3], which is stated in a convenient form
with conditioning in Lemma A.4.

If $(j', l', m') \in \mathcal{W}_b^{(1)}(m)$, then $C_{jlm}$ is independent of
$C_{j'l'm'}$. Applying the Packing Lemma without conditioning
gives, for all $\delta > 0$, $n \ge n_0(\delta, |\mathcal{U}||\mathcal{X}|)$,

$E\left[ \frac{|T_V(C_{jlm}) \cap T_{V'}(C_{j'l'm'})|}{|T_V(C_{jlm})|} \right] \le \exp\{-n[I(Q, V') - \delta]\}$

If $(j', l', m') \in \mathcal{W}_b^{(2)}(m, l)$ instead, then $C_{jlm}$ is conditionally
independent of $C_{j'l'm'}$ given $U_m$. The Packing Lemma gives

$E\left[ \frac{|T_V(C_{jlm}) \cap T_{V'}(C_{j'l'm'})|}{|T_V(C_{jlm})|} \right] \le \exp\{-n[I(Q_1, V'|Q_0) - \delta]\}$

Combining the last three inequalities, we have for $n$ sufficiently
large that

$E\left[ \frac{|\Phi_b^c(m, l) \cap T_V(C_{jlm})|}{|T_V(C_{jlm})|} \right] \le JLM \exp\{-n[I(Q, V) - \delta]\} + JL \exp\{-n[I(Q_1, V|Q_0) - \delta]\}$

where we have used the fact that $|\mathcal{W}_b^{(1)}(m)| = JL(M - 1)$ and
$|\mathcal{W}_b^{(2)}(m, l)| = J(L - 1)$; replaced $I(Q, V')$ and $I(Q_1, V'|Q_0)$
by their minima $I(Q, V)$ and $I(Q_1, V|Q_0)$ respectively, which
correspond to the most slowly decaying terms; and applied the
Type Counting Lemma to $|\mathcal{V}_n(Q)|$. Hence,

$\liminf_{n \to \infty} -\frac{1}{n} \log \beta(V, \Theta, \Phi_b^c) \ge \left| \min\{I(Q, V) - R_M,\ I(Q_1, V|Q_0)\} - R_J - R_L \right|^+$
$= \left| I(Q_1, V|Q_0) - R_J - R_L + |I(Q_0, Q_1 V) - R_M|^- \right|^+$

because $I(Q, V) = I(Q_0, Q_1 V) + I(Q_1, V|Q_0)$ and $\min\{a, b\} = b + \min\{0, a - b\}$.

^16 Since Bob can also decode the junk data as reliably as the secret, one
may potentially transmit meaningful data instead of the junk provided that
the data is uniformly random and need not be secured at the same level as
the secret.

^17 The reason for this separation is that the two types of error lead to two
different exponents.
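
The last simplification for Bob's exponent uses the chain rule $I(Q, V) = I(Q_0, Q_1 V) + I(Q_1, V|Q_0)$ together with $\min\{a, b\} = b + \min\{0, a - b\}$. A small Python check of the equivalence of the two forms is sketched below; the function and argument names are hypothetical, with `I_cloud` standing for $I(Q_0, Q_1 V)$ and `I_cond` for $I(Q_1, V|Q_0)$.

```python
def bob_exponent_min_form(I_cloud, I_cond, R_M, R_J, R_L):
    """|min{I(Q,V) - R_M, I(Q1,V|Q0)} - R_J - R_L|^+ with I(Q,V) = I_cloud + I_cond."""
    I_joint = I_cloud + I_cond  # chain rule of mutual information
    return max(0.0, min(I_joint - R_M, I_cond) - R_J - R_L)


def bob_exponent_nested_form(I_cloud, I_cond, R_M, R_J, R_L):
    """|I(Q1,V|Q0) - R_J - R_L + |I(Q0,Q1V) - R_M|^-|^+."""
    return max(0.0, I_cond - R_J - R_L + min(0.0, I_cloud - R_M))
```

The nested form separates the two error events: the inner term is active only when the public rate exceeds what the cloud centers support, in which case it reduces the exponent available to the private layer.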



B. Exponent for Eve 

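The exponent of $\beta(V, \Theta, \Phi_e^c)$ for Eve can be calculated
analogously. With MMI decoding, $\Phi_e^c(m) \cap T_V(C_{jlm})$ is the set
of $z \in T_V(C_{jlm})$ that has $I(U_m \wedge z)$ no larger than $I(U_{m'} \wedge z)$
for some misleading codeword $U_{m'}$ where $m' \in M \setminus \{m\}$,
i.e.

$\Phi_e^c(m) \cap T_V(C_{jlm}) = \{ z \in T_V(C_{jlm}) \cap T_{Q_1 V'}(U_{m'}) : m' \in M \setminus \{m\},\ V' \in \mathcal{V}_e(V) \}$

where the set of problematic conditional types for Eve is
$\mathcal{V}_e(V) := \{ V' \in \mathcal{V}_n(Q) : I(Q_0, Q_1 V') \ge I(Q_0, Q_1 V) \}$. By
the union bound,

$|\Phi_e^c(m) \cap T_V(C_{jlm})| \le \sum_{V' \in \mathcal{V}_e(V)} \sum_{m' \in M \setminus \{m\}} |T_V(C_{jlm}) \cap T_{Q_1 V'}(U_{m'})|$

Since $C_{jlm}$ is independent of $U_{m'}$ where $m' \ne m$, the Packing
Lemma A.4 without conditioning (but with $\tilde{Q}$ assigned as $Q_0$,
and $\tilde{V}$ assigned as $Q_1 V'$) gives, for all $n \ge n_0(\delta, |\mathcal{U}|)$,

$E\left[ \frac{|T_V(C_{jlm}) \cap T_{Q_1 V'}(U_{m'})|}{|T_V(C_{jlm})|} \right] \le \exp\{-n[I(Q_0, Q_1 V') - \delta]\}$

Substituting this into the previous inequality, we have for $n$
sufficiently large that

$E\left[ \frac{|\Phi_e^c(m) \cap T_V(C_{jlm})|}{|T_V(C_{jlm})|} \right] \le M \exp\{-n[I(Q_0, Q_1 V) - \delta]\}$

where we have replaced $I(Q_0, Q_1 V')$ by its minimum
$I(Q_0, Q_1 V)$. The exponent is therefore

$\liminf_{n \to \infty} -\frac{1}{n} \log \beta(V, \Theta, \Phi_e^c) \ge |I(Q_0, Q_1 V) - R_M|^+$

which completes the proof of Lemma VI.1. ∎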

VII. Results 

The exponents of $\beta(V, \Theta, \Psi')$, $\beta(V, \Theta, \Phi_b^c)$ and $\beta(V, \Theta, \Phi_e^c)$
calculated in Lemma V.3 and Lemma VI.1 using the random
code in Definition IV.3 and the coding scheme in Definition IV.2
give an initial set of lower bounds to the exponents
by Lemma IV.2. As discussed in Section IV-B, the bounds can
then be extended with a prefixed DMC $\tilde{V}$ by rewriting $(W_b, W_e)$
as $(\tilde{V}W_b, \tilde{V}W_e)$.

To obtain the final version of the bounds, consider the
following rate reallocation: move the first $R \in [0, R_L]$ bits
of the secret to the end of the public message, and encode
them with a wiretap channel code at rate $(R_M + R, R_L - R)$.

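Theorem VII.1 (Inner bound of achievable exponent triples).
For every rate triple $(R_M, R_L, R_\lambda)$, we have for all $R \in [0, R_L]$,
$R_J \ge 0$, finite sets $\mathcal{U}$ and $\tilde{\mathcal{X}}$, distribution $Q := Q_0 \circ Q_1$
($Q_0 \in \mathcal{P}(\mathcal{U})$, $Q_1 \in \mathcal{P}(\tilde{\mathcal{X}})^{\mathcal{U}}$), transitional probability
matrix $\tilde{V} \in \mathcal{P}(\mathcal{X})^{\mathcal{U} \times \tilde{\mathcal{X}}}$, the exponent triple $(E_b, E_e, S_e)$
satisfying the following is achievable (see Definition III.3) for
the wiretap channel $\{W_b, W_e\}$.

$E_b \ge \min_V D(V\|\tilde{V}W_b|Q) + \left| I(Q_1, V|Q_0) - R_J - R_L + R + |I(Q_0, Q_1 V) - R_M - R|^- \right|^+$

$E_e \ge \min_V D(V\|\tilde{V}W_e|Q) + |I(Q_0, Q_1 V) - R_M - R|^+$

$S_e \ge \min_V D(V\|\tilde{V}W_e|Q) + \left| R_L - R - R_\lambda + |R_J - I(Q_1, V|Q_0)|^- \right|^+$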

From this, we can compute an inner bound to the region
of strongly achievable rate triples for which the above inner bounds
to the achievable exponent triple are all strictly positive. To
simplify notation, let $(U, \tilde{X}, X, Y, Z)$ be some random variables
distributed as $Q_0(u) Q_1(\tilde{x}|u) \tilde{V}(x|u, \tilde{x}) W_b(y|x) W_e(z|x)$.
(Note that $(U, \tilde{X}) \to X \to YZ$.) Since the information divergence
$D(V\|W)$ is zero at $V = W$ and positive otherwise, the
exponents are positive iff, for $R \in [0, R_L]$ and $R_J \ge 0$,

(31a)  $R_J + R_L - R < I(\tilde{X} \wedge Y|U)$
(31b)  $R_M + R_J + R_L < I(U\tilde{X} \wedge Y)$
(31c)  $R_M + R < I(U \wedge Z)$
(31d)  $R_L - R > R_\lambda$
(31e)  $R_L - R + R_J > R_\lambda + I(\tilde{X} \wedge Z|U)$

$R$ and $R_J$ can be eliminated without loss of optimality by
the Fourier-Motzkin elimination [10] (see Lemma A.5), which
gives the following.

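Theorem VII.2 (Inner bound of strongly achievable rate
triples). $(R_M, R_L, R_\lambda)$ is strongly achievable for the wiretap
channel $\{W_b : \mathcal{X} \to \mathcal{Y},\ W_e : \mathcal{X} \to \mathcal{Z}\}$ if

(32a)  $0 \le R_\lambda < R_L$
(32b)  $R_\lambda < I(\tilde{X} \wedge Y|U) - I(\tilde{X} \wedge Z|U)$
(32c)  $0 \le R_M < I(U \wedge Z)$
(32d)  $R_M + R_\lambda < I(U \wedge Y) + I(\tilde{X} \wedge Y|U) - I(\tilde{X} \wedge Z|U)$
(32e)  $R_M + R_L < I(\tilde{X} \wedge Y|U) + \min\{I(U \wedge Y), I(U \wedge Z)\}$

for some $(U, \tilde{X}) \to X \to YZ$ with $P_{Y|X} = W_b$ and $P_{Z|X} = W_e$.
It is admissible to have $U$ as a deterministic function of $\tilde{X}$ and

$|\mathcal{U}| \le 4 + \min\{|\mathcal{X}| - 1,\ |\mathcal{Y}| + |\mathcal{Z}| - 2\}$
$|\tilde{\mathcal{X}}| \le |\mathcal{U}|\,(2 + \min\{|\mathcal{X}| - 1,\ |\mathcal{Y}| + |\mathcal{Z}| - 2\})$

which implies $U \to \tilde{X} \to X \to YZ$ and $I(U\tilde{X} \wedge Y) = I(\tilde{X} \wedge Y)$.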

The admissible constraints are obtained from [2] as described
in Lemma A.6. They can be imposed without changing
the inner bound. Example A.4 illustrates how to compute an
inner bound of the achievable rate tuples using the
Multi-Parametric Toolbox [7] in Matlab.
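
The elimination of $R$ and $R_J$ from (31) can be sanity-checked numerically. The Python sketch below is only an algebraic check under stated assumptions, not the procedure of Lemma A.5: the function names are hypothetical, the four information quantities are treated as free non-negative numbers, $I(U\tilde{X} \wedge Y)$ is expanded by the chain rule as $I(U \wedge Y) + I(\tilde{X} \wedge Y|U)$, and (31b) is taken to include $R_M$. For a given rate triple, `witness_31` tries to construct an explicit pair $(R, R_J)$ satisfying (31), which should succeed exactly when the triple satisfies (32).

```python
def in_region_32(R_M, R_L, R_lam, I_xy_u, I_uy, I_uz, I_xz_u):
    """Constraints (32a)-(32e) with all information quantities given as numbers."""
    return (0 <= R_lam < R_L
            and R_lam < I_xy_u - I_xz_u
            and 0 <= R_M < I_uz
            and R_M + R_lam < I_uy + I_xy_u - I_xz_u
            and R_M + R_L < I_xy_u + min(I_uy, I_uz))


def satisfies_31(R, R_J, R_M, R_L, R_lam, I_xy_u, I_uy, I_uz, I_xz_u):
    """Constraints (31a)-(31e) together with R in [0, R_L] and R_J >= 0."""
    return (0 <= R <= R_L and R_J >= 0
            and R_J + R_L - R < I_xy_u
            and R_M + R_J + R_L < I_uy + I_xy_u
            and R_M + R < I_uz
            and R_L - R > R_lam
            and R_L - R + R_J > R_lam + I_xz_u)


def witness_31(R_M, R_L, R_lam, I_xy_u, I_uy, I_uz, I_xz_u):
    """Return some (R, R_J) satisfying (31), or None if the intervals are empty."""
    lo = max(0.0, R_L - I_xy_u)
    hi = min(R_L, I_uz - R_M, R_L - R_lam, I_uy + I_xy_u - R_M - R_lam - I_xz_u)
    if not lo < hi:
        return None
    R = (lo + hi) / 2  # midpoint keeps every strict inequality on R strict
    lo_j = max(0.0, R_lam + I_xz_u - R_L + R)
    hi_j = min(I_xy_u - R_L + R, I_uy + I_xy_u - R_M - R_L)
    if not lo_j < hi_j:
        return None
    return R, (lo_j + hi_j) / 2
```

Such a check does not replace the elimination argument, but it catches sign and index slips in the constraints quickly.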

The closure of the rate region of $(R_M, R_L, R_\lambda)$ is indeed
equivalent to the closure of the rate region of $(R_0, R_1, R_e)$ in
Theorem 1 of [2]. More precisely, we have the following.

Proposition VII.3 (Equivalent rate region). Let $\mathcal{R}$ be the inner
bound of strongly achievable rate tuples $(R_M, R_L, R_\lambda)$ in
Theorem VII.2, and $\mathcal{R}'$ be the set of rate tuples that satisfy

(33a)  $0 \le R_\lambda < R_L$
(33b)  $R_\lambda < I(\tilde{X} \wedge Y|U) - I(\tilde{X} \wedge Z|U)$
(33c)  $0 \le R_M < \min\{I(U \wedge Y), I(U \wedge Z)\}$
(33d)  $R_M + R_L < I(\tilde{X} \wedge Y|U) + \min\{I(U \wedge Y), I(U \wedge Z)\}$

for some $(U, \tilde{X}) \to X \to YZ$ with the same admissible
constraints as $\mathcal{R}$. Then, $\mathcal{R} = \mathcal{R}'$.

Hence, $\mathcal{R}$ is convex by Lemma 5 of [2] and the closure
of its projection on $(R_M, R_L)$ is the rate region for the
asymmetric broadcast channel by Corollary 5 of [2]. Suppose
$W_b$ is more capable [5] than $W_e$, i.e. $I(X \wedge Y) \ge I(X \wedge Z)$
for all $P_X \in \mathcal{P}(\mathcal{X})$. Then it is admissible to have $\tilde{X} = X$
(i.e. no prefix DMC) by a straightforward extension of the
proof of Theorem 3 in [2]. It also follows that $0 \le R_\lambda <
\max_{P_X} [I(X \wedge Y) - I(X \wedge Z)]$ is the projection of $\mathcal{R}$ on $R_\lambda$.
Assume the stronger condition that $W_b$ is less noisy [5] than
$W_e$, i.e. $I(U \wedge Y) \ge I(U \wedge Z)$ for any $U \to X \to YZ$. Then,
by Theorem 3 in [2], it is admissible to have $U$ deterministic
in addition to $\tilde{X} = X$ to obtain the projection on $(R_L, R_\lambda)$.
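Proof of Proposition VII.3: Without loss of generality,
consider some $U \to \tilde{X} \to X \to YZ$ with $\mathcal{U} \cap \tilde{\mathcal{X}} = \emptyset$. Let
$U_\alpha$ be a random variable such that it is $\tilde{X}$ with probability
$\alpha$ and $U$ with probability $1 - \alpha$, and that $1\{U_\alpha = \tilde{X}\}$ is
independent of $(U, \tilde{X}, X, Y, Z)$.^18 Then, $U_1 = \tilde{X}$, $U_0 = U$,
$U_\alpha \to \tilde{X} \to X \to YZ$,

$I(U_\alpha \wedge Y) = (1 - \alpha) I(U \wedge Y) + \alpha I(\tilde{X} \wedge Y)$
$I(\tilde{X} \wedge Y|U_\alpha) = I(\tilde{X} \wedge Y) - I(U_\alpha \wedge Y)$

and similarly for $Z$. Thus, we can define $\mathcal{R}_\alpha$ and $\mathcal{R}'_\alpha$ as the
corresponding rate polytopes defined by the linear constraints
in (32) and (33) respectively.

If we impose (33c) on $\mathcal{R}_0$, the resulting polytope is the same
as $\mathcal{R}'_0$ because (32c) and (32d) are redundant under (32b) and
(33c). Thus, $\mathcal{R}_0 \supseteq \mathcal{R}'_0$, which implies $\mathcal{R} \supseteq \mathcal{R}'$.

If $I(U \wedge Z) \le I(U \wedge Y)$, then (33c) is equivalent to (32c).
By the previous argument, $\mathcal{R}_0 = \mathcal{R}'_0$.

If $I(\tilde{X} \wedge Y|U) \le I(\tilde{X} \wedge Z|U)$, then $\mathcal{R}_0 = \mathcal{R}'_0 = \emptyset$ by the
identical constraints (32b) and (33b).

Consider $I(U \wedge Y) < I(U \wedge Z) \le I(\tilde{X} \wedge Z) < I(\tilde{X} \wedge Y)$.
Choose $\alpha$ such that $I(U_\alpha \wedge Y) = I(U \wedge Z)$. The convex hull,
$\mathrm{Hull}(\mathcal{R}'_0, \mathcal{R}'_\alpha)$, contains $\mathcal{R}_0$ primarily because the hyperplanes
of (32c) and (32d) for $\mathcal{R}_0$ intersect at

$R_\lambda = I(\tilde{X} \wedge Y) - I(\tilde{X} \wedge Z) \le I(\tilde{X} \wedge Y|U_\alpha) - I(\tilde{X} \wedge Z|U_\alpha)$

which is contained by the half-space (33b) (with non-strict
inequality instead) for $\mathcal{R}'_\alpha$. This is illustrated in Fig. 6. For
comparison, $\mathcal{R}_0$ is plotted with a blue dotted frame in each
sub-figure. It is contained by the convex hull in Fig. 6(c) as
expected.

Fig. 6. $\mathcal{R}_0 \subset \mathrm{Hull}(\mathcal{R}'_0, \mathcal{R}'_\alpha)$ for the case $I(U \wedge Y) < I(U \wedge Z) < I(\tilde{X} \wedge Y)$ (axes $R_L$, $R_M$; panel (c) shows the hull)

Finally, consider the case $I(\tilde{X} \wedge Y) \le I(\tilde{X} \wedge Z)$. Choose
$\alpha$ such that $I(U_\alpha \wedge Y) = I(\tilde{X} \wedge Y) - I(\tilde{X} \wedge Z|U)$.^19 Then,
$\mathrm{Hull}(\mathcal{R}'_0, \mathcal{R}'_\alpha)$ contains $\mathcal{R}_0$ primarily because the hyperplane
of (32d) intersects with the plane $R_\lambda = 0$ at

$R_M = I(\tilde{X} \wedge Y) - I(\tilde{X} \wedge Z|U)$

which is contained by the half-space (with non-strict inequality)
of (33c) for $\mathcal{R}'_\alpha$. This is illustrated in Fig. 7. Hence, we
have $\mathcal{R}_0$ a subset of $\mathrm{Hull}(\mathcal{R}'_0, \mathcal{R}'_\alpha)$ for some $\alpha \in (0, 1)$, which
implies $\mathcal{R} \subseteq \mathcal{R}'$ as desired. ∎

^18 This proof technique is from the proof of Theorem 4.1 in [3, p. 360].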

Fig. 7. $\mathcal{R}_0 \subset \mathrm{Hull}(\mathcal{R}'_0, \mathcal{R}'_\alpha)$ for the case $I(\tilde{X} \wedge Y) < I(\tilde{X} \wedge Z)$ (panel (c) shows the hull)

^19 If $I(\tilde{X} \wedge Z|U) = 0$, choose $\alpha$ to approach 1 from below to ensure that
$\mathcal{R}'_\alpha \ne \emptyset$.

VIII. Conclusion

In doubt of a unifying measure of security, we have considered
the success exponent as an alternative to the equivocation
rate for the wiretap channel considered in [2]. We replace
the maximal code construction and typical set decoding in
[2] with the random coding scheme and maximum empirical
mutual information decoding in [6]. The lower bounds on the
error exponents follow from [6] with the well-known Packing 
Lemma (see Lemma A. 4), while the lower bound on the 
success exponent is obtained with the approach of [4] and 
a technique we call the Overlap Lemma (see Lemma V.2). 
This lemma gives a doubly exponential behavior that enables 
us to guarantee good realization of the random code for 
effective stochastic encoding by transmission of junk data (see 
Section IV-B). Combining with the prefix DMC technique 
in [2] that adds artificial memoryless noise to the channel 
input symbols, and a rate reallocation step of transferring 
some secret bits to the public message before encoding (see 
Section VII), we obtain the final inner bound of the achievable 
exponent triples in Theorem VII. 1 with the corresponding 
strongly achievable rate triples in Theorem VII. 2. Proposi- 
tion VII. 3 shows that this inner bound to the rate region is 
convex and coincides with the region of achievable rate triples 
in Theorem 1 of [2]. 

It is a straightforward extension to consider the maximum 
error exponents and average success exponent over the mes- 
sages. The same bound follows by the usual expurgation argu- 
ment and a more careful application of the doubly exponential 
behavior of the Overlap Lemma. Whether this tradeoff is 
optimal, however, is unclear. It would be surprising if one can 
further improve the tradeoff by improving the coding scheme. 

Appendix 

Example A.1 (Maximum a priori and a posteriori success
probability). Consider the following probability matrix.



Py 



Ps\z 



from which the a priori probability is Ps — [j \] - Without 
knowing Z, Eve guesses S successfully with probability at 
most I if one guess is allowed, and 1 if two guesses are 
allowed. If she knows Z, she still has the same maximum 
probability of success in each case because the most probable 
candidate for the secret is the same regardless of whether 
Z is observed. Hence, Eve cannot achieve a better success 
probability regardless of Z, even though Z is not independent 
of S. Success probability fails to express the notion of perfect 
secrecy in this sense. 

Example A.2 (Transmission of junk data). Consider the case
when there is no public message, and the coding is not
restricted to constant composition codes. Fig. 8 illustrates the
approach of transmission of junk data through a wiretap
channel, which consists of a binary noiseless channel for Bob
and a binary erasure channel for Eve. While the channel
input is perfectly observed by Bob, half of it is erased on
average before it reaches Eve. Alice exploits this by sending
one bit of junk $J$ uniformly distributed in $\{0, 1\}$ together
with one bit of secret $l \in \{0, 1\}$ in two channel uses. The
channel input is $X = (J, J \oplus l)$, where $\oplus$ denotes the XOR
operation. Bob can recover the secret perfectly by the decoder
$\phi_b(y) = y^{(1)} \oplus y^{(2)}$ since his observation $Y$ is equal to
$X$. Eve can use the same decoding if there is no erasure.
However, if there is one or more erasures, her observation $Z$
becomes independent of the secret, in which case she should
uniformly randomly pick 0 or 1 as her guess to minimize the
conditional error probability, provided that she can only make
one guess.^20 Thus, the conditional error probability is 0 if there
is no erasure, which happens with probability 1/4, and 1/2
otherwise. The overall error probability is 3/8.

Fig. 8. An example of transmission of junk data: (a) channel $W_b$ to Bob; (b) channel $W_e$ to Eve; (c) stochastic encoder $f(l) := c_{Jl}$ with $J \sim \mathrm{Bern}(0.5)$, where the codewords $c_{jl}$ are 00, 11 (for $l = 0$) and 01, 10 (for $l = 1$)

Note that if Alice uses a prefix DMC as described in
Section IV-B, Bob cannot achieve zero error probability. In
other words, prefix DMC is strictly inferior in this case.^21

^20 We allow stochastic decoding here since the focus is the probability at
block length n = 2 instead of the exponent when $n \to \infty$.
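
The error probability 3/8 in the junk data example can be confirmed by exhaustive enumeration. The Python sketch below assumes, as the example suggests, that each of Eve's two channel uses is erased independently with probability 1/2 and that the secret and the junk bit are uniform; the function name and the symbol `"e"` for an erasure are illustrative choices. It builds the joint distribution of the secret and Eve's observation and evaluates her best single guess (the maximum a posteriori rule).

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product


def eve_map_error(erasure=Fraction(1, 2)):
    """Exact error probability of Eve's best single guess of the secret bit."""
    joint = defaultdict(Fraction)  # joint[(l, z)] = Pr{secret = l, Eve observes z}
    for l, j in product((0, 1), repeat=2):
        x = (j, j ^ l)
        assert x[0] ^ x[1] == l  # Bob sees x and recovers the secret by XOR
        for pattern in product((False, True), repeat=2):
            p = Fraction(1, 4)  # uniform secret and junk bit
            for erased in pattern:
                p *= erasure if erased else 1 - erasure
            z = tuple("e" if erased else bit for bit, erased in zip(x, pattern))
            joint[(l, z)] += p
    observations = {z for (_, z) in joint}
    success = sum(max(joint[(0, z)], joint[(1, z)]) for z in observations)
    return 1 - success
```

With the default erasure probability the function returns the fraction 3/8, matching the value derived in the example: whenever at least one symbol is erased, both values of the secret remain equally likely.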

Example A.3 (Prefix discrete memoryless channel). Consider
prefixing the wiretap channel $\{W_b : \mathcal{X} \to \mathcal{Y},\ W_e : \mathcal{X} \to \mathcal{Z}\}$
with the discrete memoryless channel $\tilde{V}$ defined in Fig. 9.
Each arrow connects an input symbol to an output symbol if
the corresponding transition probability, labeled on the arrow,
is non-zero. Consider the case without prefixing the wiretap
channel with $\tilde{V}$. Since $W_b$ is a weakly symmetric channel, the
capacity is 1 bit by the capacity formula for weakly symmetric
channels in Theorem 8.2.1 of [1]. Bob can achieve the capacity of
1 bit with zero error probability and a single use of the
channel iff Alice encodes 1 bit of information using any of
the following codebooks $\theta(1) := \{00, 10\}$, $\theta(2) := \{00, 11\}$,
$\theta(3) := \{01, 10\}$ and $\theta(4) := \{01, 11\}$. If Alice wants to
have zero error probability for Bob in n channel uses with
rate n bits, the codebook has to be some concatenation of
codebooks from $\{\theta(i)\}_{i=1}^{4}$. However, the channel input $X^n$
would not be independent of the channel output $Z^n$ to Eve. To
argue this, consider the i-th channel use only. Suppose Alice
uses $\theta(1)$ to encode a uniformly random bit at that time slot.
Then, given $Z^{(i)} = 0$, we have $X^{(i)} = 10$ with probability
2/3 rather than the prior probability 1/2. The other cases
can be argued similarly. In short, not randomizing over the
code unavoidably leaks information to Eve. However, if the
randomization is done by transmitting junk data, the useful
data rate would drop below the capacity of 1 bit.

Fig. 9. An example of prefix discrete memoryless channel: (a) prefix channel $\tilde{V}$; (b) channel $W_b$ to Bob; (c) channel $W_e$ to Eve; (d) $\tilde{V}W_b$; (e) $\tilde{V}W_e$

Consider prefixing the wiretap channel with $\tilde{V}$. The prefixed
channel $\tilde{V}W_b$ to Bob is a noiseless binary channel as shown
in Fig. 9(d). The prefixed channel $\tilde{V}W_e$ to Eve, however, is
completely noisy as shown in Fig. 9(e). One can check that the
channel output $Z$ is independent of $\tilde{X}$ for any input distribution
on $\tilde{X}$. Thus, Alice can transmit at the capacity of 1 bit with zero
error probability for Bob but without leaking any information
to Eve. Prefixing with a discrete memoryless channel is strictly better
than transmitting junk data in this case.

^21 It would be more interesting to find an example in which prefix DMC
is inferior even if Bob's probability of error cannot be made zero by adding
noise with memory like what the transmission of junk data does.

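Lemma A.1 (random codeword). For $\delta > 0$, $n \in \mathbb{Z}^+$, $Q := Q_0 \circ Q_1$
($Q_0 \in \mathscr{P}_n(\mathcal{U})$, $Q_1 \in \mathscr{V}_n(Q_0, \mathcal{X})$), $V \in \mathscr{V}_n(Q, \mathcal{Z})$,
$\mathbf{u} \in T_{Q_0}$, and an $n$-sequence $\mathbf{X}$ chosen uniformly at random from
$T_{Q_1}(\mathbf{u})$,

$$\Pr\{\mathbf{z} \in T_V(\mathbf{u} \circ \mathbf{X})\} = \frac{|T_{X|U,Z}(\mathbf{u}, \mathbf{z})|}{|T_{X|U}(\mathbf{u})|} \le \exp\{-n[I(Q_1, V|Q_0) - \delta]\}$$

where the last inequality holds for all $n \ge n_0(\delta, |\mathcal{U}||\mathcal{X}|)$;
$(U, X, Z)$ in the first equality is a random tuple with joint
distribution $P_{U,X,Z} := Q_0 \circ Q_1 \circ V$; $T_{P_{X|U,Z}}$ is denoted by $T_{X|U,Z}$
and similarly for others; and $|T_{X|U,Z}(\mathbf{u}, \mathbf{z})|$ with $(\mathbf{u}, \mathbf{z}) \in T_{U,Z}$
is denoted by $|T_{X|U,Z}|$.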

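Proof: Consider $\mathbf{z} \in T_{QV}$, for which the desired probability
is non-zero. Since $\mathbf{u} \in T_U$, $\mathbf{X} \in T_{X|U}(\mathbf{u})$, and
$\mathbf{z} \in T_Z$, the event $\{\mathbf{z} \in T_V(\mathbf{u} \circ \mathbf{X})\}$, or equivalently
$\{\mathbf{z} \in T_{Z|U,X}(\mathbf{u} \circ \mathbf{X})\}$, happens iff $(\mathbf{u}, \mathbf{X}, \mathbf{z}) \in T_{U,X,Z}$. This
happens iff $\mathbf{X} \in T_{X|U,Z}(\mathbf{u}, \mathbf{z})$. Hence, for all $\mathbf{z} \in T_Z$,

$$\Pr\{\mathbf{z} \in T_V(\mathbf{u} \circ \mathbf{X})\} = \Pr\{\mathbf{X} \in T_{X|U,Z}(\mathbf{u}, \mathbf{z})\} = \frac{|T_{X|U,Z}(\mathbf{u}, \mathbf{z})|}{|T_{X|U}(\mathbf{u})|} \le \exp\{-n[I(X \wedge Z|U) - \delta]\}$$

where the last inequality is true for all $n \ge n_0(\delta, |\mathcal{U}||\mathcal{X}|)$ due
to Lemma 1.2.5 of [3] that

$$|T_{X|U,Z}(\mathbf{u}, \mathbf{z})| \le \exp\{n H(X|U,Z)\}$$

$$|T_{X|U}(\mathbf{u})| \ge (n+1)^{-|\mathcal{U}||\mathcal{X}|} \exp\{n H(X|U)\}$$

Since $I(X \wedge Z|U) = I(Q_1, V|Q_0)$, this gives the desired bound.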
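
The step from the two type-class bounds to the exponent can be spelled out as follows. This is only a sketch in the notation of the proof; the symbol $\delta_n$ is introduced here for illustration and does not appear in the original, and $\log$ is taken to the same base as $\exp$.

```latex
\begin{align*}
\frac{|T_{X|U,Z}(\mathbf{u},\mathbf{z})|}{|T_{X|U}(\mathbf{u})|}
  &\le (n+1)^{|\mathcal{U}||\mathcal{X}|}
       \exp\{-n[H(X|U) - H(X|U,Z)]\} \\
  &= \exp\{-n[I(X \wedge Z|U) - \delta_n]\},
  \qquad
  \delta_n := \frac{|\mathcal{U}||\mathcal{X}|}{n}\log(n+1).
\end{align*}
```

Since $\delta_n \to 0$ as $n \to \infty$ and depends only on $n$ and $|\mathcal{U}||\mathcal{X}|$, there is a threshold $n_0(\delta, |\mathcal{U}||\mathcal{X}|)$ beyond which $\delta_n \le \delta$.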



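Lemma A.2. For all $n$, $\exp(nR)$, $\exp(n\delta) \in \mathbb{Z}^+$,

$$\binom{\exp(nR)}{\exp(n\delta)} \le \exp\{(\log e + n(R - \delta)) \exp(n\delta)\}$$

Proof: Let $a := \exp(nR)$ and $b := \exp(n\delta)$. Then, we
have the well-known inequality that $\binom{a}{b} \le \left(\frac{a}{b} e\right)^b$, which gives
the R.H.S. of the bound as desired. To derive this, note that
$e^x \ge (1 + x)$ for all $x \ge 0$. Thus, for $x > 0$,

$$e^{ax} \ge (1 + x)^a = \sum_{i=0}^{a} \binom{a}{i} x^i \ge \binom{a}{b} x^b, \quad \text{i.e.} \quad \binom{a}{b} \le e^{ax - b \ln x}.$$

Setting $x := b/a$ gives the desired inequality.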
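
As a quick numerical sanity check of the inequality $\binom{a}{b} \le (ae/b)^b$ used in the proof, the following Python sketch compares both sides in the log domain over a small grid. The function names and the grid size are illustrative choices, not part of the original.

```python
import math

def log_binom(a: int, b: int) -> float:
    """Natural log of the binomial coefficient C(a, b), computed from the exact integer."""
    return math.log(math.comb(a, b))

def log_bound(a: int, b: int) -> float:
    """Natural log of (a*e/b)**b, the bound with a = exp(nR) and b = exp(n*delta)."""
    return b * (1.0 + math.log(a) - math.log(b))

def bound_holds(a: int, b: int) -> bool:
    # The small tolerance guards against floating-point rounding only.
    return log_binom(a, b) <= log_bound(a, b) + 1e-9

# Check every pair 1 <= b <= a <= 150.
all_ok = all(bound_holds(a, b) for a in range(1, 151) for b in range(1, a + 1))
print(all_ok)
```

The check is not a proof, but it confirms that the bound holds, with room to spare, on every pair tested.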
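Lemma A.3 (list size constraint). For any subset $S \subseteq \mathcal{Z}^n$
of observations and list decoder with list size $\lambda$, the
corresponding decision region map $\Psi : L \mapsto 2^{\mathcal{Z}^n}$ satisfies,

$$\sum_{l \in L} |\Psi(l) \cap S| = \lambda |S| \tag{34}$$

Proof: The proof is by the double counting principle,

$$\sum_{l \in L} |\Psi(l) \cap S| = \sum_{\mathbf{z} \in S} \sum_{l \in L} \mathbb{1}\{l \in \psi(\mathbf{z})\} = \sum_{\mathbf{z} \in S} \lambda = \lambda |S|$$

where $\psi(\mathbf{z}) := \{l \in L : \mathbf{z} \in \Psi(l)\}$ is the list of $\lambda$ guesses output for the observation $\mathbf{z}$.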
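
The double counting argument can be checked by brute force on a toy list decoder. The sketch below is an illustration under assumed parameters: the random decoder, the message count, the 4-bit observation space and the subset are all hypothetical choices.

```python
import itertools
import random

def double_count(num_messages, observations, list_size, subset, seed=0):
    """Return both sides of the list size identity for a randomly drawn list decoder."""
    rng = random.Random(seed)
    # Hypothetical list decoder: every observation is mapped to a list of
    # `list_size` distinct messages.
    decoder = {z: set(rng.sample(range(num_messages), list_size)) for z in observations}
    # Decision region of message l: all observations whose list contains l.
    regions = {l: {z for z in observations if l in decoder[z]} for l in range(num_messages)}
    lhs = sum(len(regions[l] & subset) for l in range(num_messages))
    rhs = list_size * len(subset)
    return lhs, rhs

observations = list(itertools.product((0, 1), repeat=4))
subset = set(observations[:5])
lhs, rhs = double_count(num_messages=8, observations=observations, list_size=3, subset=subset)
print(lhs, rhs)
```

Each observation in the subset is counted once for every message on its list, so both sides equal the list size times the subset size regardless of how the decoder is drawn.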



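Lemma A.4 (Packing (with conditioning)). Consider some
finite sets $\mathcal{U}$, $\mathcal{X}$ and $\mathcal{Y}$, type $Q_0 \in \mathscr{P}_n(\mathcal{U})$, and canonical
conditional types $Q_1 \in \mathscr{V}_n(Q_0, \mathcal{X})$ and $\tilde{Q}_1 \in \mathscr{V}_n(Q_0, \mathcal{X})$. Let
$Q := Q_0 \circ Q_1$ and $\tilde{Q} := Q_0 \circ \tilde{Q}_1$ be the corresponding joint
types; $\mathbf{U}$ be some random $n$-sequence distributed over $T_{Q_0}$; $\mathbf{X}$
and $\tilde{\mathbf{X}}$ be independently and uniformly randomly drawn from
$T_{Q_1}(\mathbf{U})$ and $T_{\tilde{Q}_1}(\mathbf{U})$ respectively; $\mathbf{C} := \mathbf{U} \circ \mathbf{X}$ and $\tilde{\mathbf{C}} := \mathbf{U} \circ \tilde{\mathbf{X}}$
denote the element-wise concatenations. Then, for all $\delta > 0$,
$n \ge n_0(\delta, |\mathcal{U}||\mathcal{X}|)$, $V \in \mathscr{V}_n(Q, \mathcal{Y})$ and $\tilde{V} \in \mathscr{V}_n(\tilde{Q}, \mathcal{Y})$,

$$E\left[\frac{|T_V(\mathbf{C}) \cap T_{\tilde{V}}(\tilde{\mathbf{C}})|}{|T_V(\mathbf{C})|}\right] \le \exp\{-n[I(\tilde{Q}_1, \tilde{V}|Q_0) - \delta]\}$$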



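Proof: Consider some realization $\mathbf{u} \in T_{Q_0}$ of $\mathbf{U}$. By
conditional independence between $\mathbf{X}$ and $\tilde{\mathbf{X}}$,

$$\begin{aligned}
E\left[|T_V(\mathbf{C}) \cap T_{\tilde{V}}(\tilde{\mathbf{C}})| \,\middle|\, \mathbf{U} = \mathbf{u}\right]
&= \sum_{\mathbf{y}} \Pr\{\mathbf{y} \in T_V(\mathbf{u} \circ \mathbf{X})\} \Pr\{\mathbf{y} \in T_{\tilde{V}}(\mathbf{u} \circ \tilde{\mathbf{X}})\} \\
&\le \sum_{\mathbf{y} \in T_{Y|U}(\mathbf{u})} \frac{|T_{X|U,Y}|}{|T_{X|U}|} \exp\{-n[I(\tilde{Q}_1, \tilde{V}|Q_0) - \delta]\} \\
&= \frac{|T_{Y|U}||T_{X|U,Y}|}{|T_{X|U}|} \exp\{-n[I(\tilde{Q}_1, \tilde{V}|Q_0) - \delta]\}
\end{aligned}$$

where the first inequality follows from Lemma A.1 (both the
equality and inequality cases) $\forall n \ge n_0(\delta, |\mathcal{U}||\mathcal{X}|)$, with $U$,
$X$ and $Y$ and $T_{X|U,Y}$ etc. defined analogously. Divide both
sides by $|T_V(\mathbf{u} \circ \mathbf{X})| = |T_{Y|U,X}|$ and apply the fact that
$|T_{Y|U}||T_{X|U,Y}| = |T_{X|U}||T_{Y|U,X}|$.

$$E\left[\frac{|T_V(\mathbf{C}) \cap T_{\tilde{V}}(\tilde{\mathbf{C}})|}{|T_V(\mathbf{C})|} \,\middle|\, \mathbf{U} = \mathbf{u}\right] \le \exp\{-n[I(\tilde{Q}_1, \tilde{V}|Q_0) - \delta]\}$$

Averaging both sides over $\mathbf{U}$ gives the desired bound. ■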

Lemma A.5 (Fourier-Motzkin). The rate constraints in (31)
with $R \in [0, R_L]$ and $R_J \ge 0$ define the same region of (non-
negative) rate triples $(R_M, R_L, R_\lambda)$ as the rate constraints in
(32) do.

Proof: Consider applying the Fourier-Motzkin elimination.
From (31) and $R \in [0, R_L]$, we have,

$$\begin{aligned}
-R &\le 0 \\
-R + R_J + R_L &\le I(\tilde{X} \wedge Y|U) \\
R - R_L &\le 0 \\
R + R_M &\le I(U \wedge Z) \\
R - R_L + R_\lambda &\le 0 \\
R - R_J - R_L + R_\lambda &\le -I(\tilde{X} \wedge Z|U) \\
R_J + R_L + R_M &\le I(U\tilde{X} \wedge Y)
\end{aligned}$$

Adding each of the first two inequalities to the next four
eliminates $R$, which, together with $R_J \ge 0$, gives,

$$\begin{aligned}
-R_J &\le 0 \\
-R_J - R_L + R_\lambda &\le -I(\tilde{X} \wedge Z|U) \\
R_J + R_L + R_M &\le I(U \wedge Z) + I(\tilde{X} \wedge Y|U) \\
R_J + R_\lambda &\le I(\tilde{X} \wedge Y|U) \\
R_J + R_L + R_M &\le I(U\tilde{X} \wedge Y) \\
R_M &\le I(U \wedge Z) \\
-R_L + R_\lambda &\le 0 \\
R_\lambda &\le I(\tilde{X} \wedge Y|U) - I(\tilde{X} \wedge Z|U)
\end{aligned}$$

where we have removed some inactive constraints. Adding
each of the first two inequalities to the next three inequalities
eliminates $R_J$, which gives (32) as desired. ■
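
The first elimination step in the proof can be reproduced mechanically. The following Python sketch is a generic Fourier-Motzkin step, not the authors' code; it keeps each right-hand side as a coefficient vector over the four mutual information terms, so no numerical values are assumed. The variable ordering and the helper name `eliminate` are choices made here for illustration.

```python
def eliminate(rows, k):
    """One Fourier-Motzkin step on a system of rows (a, c) meaning sum_j a[j]*x[j] <= c.

    The right-hand side c is a list of coefficients over symbolic constants.
    Variable k is removed by combining every row with a positive coefficient
    on x[k] with every row that has a negative one.
    """
    keep = [(a, c) for a, c in rows if a[k] == 0]
    pos = [(a, c) for a, c in rows if a[k] > 0]
    neg = [(a, c) for a, c in rows if a[k] < 0]
    for ap, cp in pos:
        for an, cn in neg:
            sp, sn = -an[k], ap[k]  # positive multipliers that cancel x[k]
            a = [sp * x + sn * y for x, y in zip(ap, an)]
            c = [sp * x + sn * y for x, y in zip(cp, cn)]
            keep.append((a, c))
    return keep

# Variables: (R, R_J, R_M, R_L, R_lambda).
# Right-hand side basis: [I(X^Y|U), I(U^Z), I(X^Z|U), I(UX^Y)], where X stands
# for the codeword variable appearing in (31).
rows = [
    ([-1, 0, 0, 0, 0], [0, 0, 0, 0]),    # -R <= 0
    ([-1, 1, 0, 1, 0], [1, 0, 0, 0]),    # -R + R_J + R_L <= I(X^Y|U)
    ([1, 0, 0, -1, 0], [0, 0, 0, 0]),    # R - R_L <= 0
    ([1, 0, 1, 0, 0], [0, 1, 0, 0]),     # R + R_M <= I(U^Z)
    ([1, 0, 0, -1, 1], [0, 0, 0, 0]),    # R - R_L + R_lambda <= 0
    ([1, -1, 0, -1, 1], [0, 0, -1, 0]),  # R - R_J - R_L + R_lambda <= -I(X^Z|U)
    ([0, 1, 1, 1, 0], [0, 0, 0, 1]),     # R_J + R_L + R_M <= I(UX^Y)
]
after_R = eliminate(rows, 0)
for a, c in after_R:
    print(a, "<=", c)
```

The output contains the rows listed after the first step of the proof, including $R_\lambda \le I(X \wedge Y|U) - I(X \wedge Z|U)$, together with the inactive ones that the proof drops. No redundancy removal is attempted in this sketch.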

Example A.4 (Inner bound of strongly achievable rate triples). 
Consider the following wiretap channel and prefix DMC. 

% wiretap channel
p = 0.1; PYX = [1-p p; p 1-p; .5 .5];        % P_{Y|X}
r = .4;  PZX = [1 0; 1-r r; r 1-r];          % P_{Z|X}
% input distributions and prefix DMC
PX = [.25 .25 .5]; PtX = PX; PXtX = eye(3);  % P_X, P_{\tilde X} and P_{X|\tilde X}
q = .3; PUtX = [1-q q; 1 ; 1];               % P_{U|\tilde X}

The prefix DMC is noiseless, i.e. $\tilde{X} = X$. The channel and $U$
are constructed based on Counter-example 2 in [5] with slight
modifications.† Define the Bayes' rule, conditional mutual
information and entropy functions as follows.

function PXY = bayes(PYX, PX)
% compute P_{X|Y} from P_{Y|X} and P_X
PXY = repmat(PX, size(PYX,2), 1) .* PYX';          % joint P_{Y,X}, one row per y
PXY = PXY ./ repmat(sum(PXY,2), 1, size(PXY,2));   % normalize each row

function IQVP = I(Q, V, P) % compute I(Q,V|P)
if nargin < 3
    P = ones(1, size(Q,1)) ./ size(Q,1);
end
IQVP = H(Q*V, P) - H(V, P*Q);

function h = H(Q, P) % compute H(Q|P)
Q(Q==0) = 1;                     % so that 0*log2(0) contributes 0
h = -P * sum(log2(Q).*Q, 2);
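
For readers without MATLAB, the entropy and mutual information functions can be sketched in plain Python as below. This is a hedged port, not the authors' code: the names `matmul`, `cond_entropy` and `cond_mutual_info` are chosen here, and the binary symmetric channel in the usage lines is an illustrative input rather than the channel of the example.

```python
import math

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def cond_entropy(Q, P):
    """H(Q|P) in bits: rows of Q are conditional distributions, P weights the rows."""
    return -sum(p * sum(q * math.log2(q) for q in row if q > 0) for p, row in zip(P, Q))

def cond_mutual_info(Q, V, P):
    """I(Q,V|P) = H(QV|P) - H(V|PQ), mirroring the MATLAB function I(Q,V,P)."""
    PQ = matmul([P], Q)[0]
    return cond_entropy(matmul(Q, V), P) - cond_entropy(V, PQ)

# Usage: unconditional mutual information of a binary symmetric channel with
# crossover 0.1 and uniform input. Q has a single row and P = [1.0].
bsc = [[0.9, 0.1], [0.1, 0.9]]
mi = cond_mutual_info([[0.5, 0.5]], bsc, [1.0])
print(mi)
```

Passing a single-row `Q` with `P = [1.0]` gives the unconditional case, which is what the MATLAB default for `P` reduces to when `Q` is a row vector.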

Then, the mutual information expressions required for the rate 
region can be computed as follows. 



% derived values
PU   = PtX*PUtX;   PtXU  = bayes(PUtX, PtX);     % P_U and P_{\tilde X|U}
PYtX = PXtX*PYX;   PYU   = PtXU*PYtX;            % P_{Y|\tilde X} and P_{Y|U}
PZtX = PXtX*PZX;   PZU   = PtXU*PZtX;            % P_{Z|\tilde X} and P_{Z|U}
IUY  = I(PU, PYU); ItXYU = I(PtXU, PYtX, PU);    % I(U \wedge Y) and I(\tilde X \wedge Y|U)
IUZ  = I(PU, PZU); ItXZU = I(PtXU, PZtX, PU);    % I(U \wedge Z) and I(\tilde X \wedge Z|U)



†This is such that the resulting constraints (32) on the rate region are not
redundant for the purpose of illustration.
Using the Multi-Parametric Toolbox [7], we first define
the polytope satisfying the constraints from (31) on
$(R, R_J, R_M, R_L, R_\lambda)$, and then project it to $(R_M, R_L, R_\lambda)$,
which should give the desired region in (32).

% constraints from (31) on (R, R_J, R_M, R_L, R_\lambda)
A = [-eye(5); -1 1 0 1 0; 1 0 0 -1 0; 1 0 1 0 0;
     1 0 0 -1 1; 1 -1 0 -1 1; 0 1 1 1 0];
b = [zeros(1,5) ItXYU 0 IUZ 0 -ItXZU IUY+ItXYU]';
P = polytope(A, b);

R = projection(P, [3 4 5]); % Project to (R_M, R_L, R_\lambda) to obtain (32)

Finally, plotting the region gives Fig. 10. 

options.wire = 1; plot(R, options);
xlabel('R_M'); ylabel('R_L'); zlabel('R_{\lambda}');
set(gca, 'CameraPosition', [1.5 -0.5 1], ...
    'CameraUpVector', [-0.5 0.2 0.8], 'DataAspectRatio', [1 1 1]);



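[Figure: the projected polytope plotted over the axes $R_M$, $R_L$ and $R_\lambda$, with facets labelled by constraints such as (32a) and (32b).]

Fig. 10. An example of an inner bound to strongly achievable rate tuples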

As expected, each facet corresponds to a constraint in (32), 
indicated in the figure. 

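Lemma A.6 (admissible constraints). Consider some random
variables in the Markov chain $U' \to \tilde{X}' \to X' \to YZ$ distributed
over the finite sets $\mathcal{U}'$, $\tilde{\mathcal{X}}'$, $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{Z}$ respectively. Then
there exists $U \to \tilde{X} \to X \to YZ$ with,

$$P_{Y|X}(y|x) = P_{Y|X'}(y|x) \quad \forall (x, y) \in \mathcal{X} \times \mathcal{Y}$$

$$P_{Z|X}(z|x) = P_{Z|X'}(z|x) \quad \forall (x, z) \in \mathcal{X} \times \mathcal{Z}$$

and

$$\begin{aligned}
I(U \wedge Y) &= I(U' \wedge Y) && \text{(35a)} \\
I(U \wedge Z) &= I(U' \wedge Z) && \text{(35b)} \\
I(\tilde{X} \wedge Y|U) &= I(\tilde{X}' \wedge Y|U') && \text{(35c)} \\
I(\tilde{X} \wedge Z|U) &= I(\tilde{X}' \wedge Z|U') && \text{(35d)}
\end{aligned}$$

and

$$\begin{aligned}
|\mathcal{U}| &\le 4 + \min\{|\mathcal{X}| - 1, |\mathcal{Y}| + |\mathcal{Z}| - 2\} && \text{(36a)} \\
|\tilde{\mathcal{X}}| &\le |\mathcal{U}| \left(2 + \min\{|\mathcal{X}| - 1, |\mathcal{Y}| + |\mathcal{Z}| - 2\}\right) && \text{(36b)} \\
H(U|\tilde{X}) &= 0 && \text{(36c)}
\end{aligned}$$

Furthermore, $X = X'$ if $|\mathcal{X}| - 1 \le |\mathcal{Y}| + |\mathcal{Z}| - 2$.

Proof: Since the following proof is a minor extension to
[2, (A.22)], we give only the changes. Readers
should refer to [2] for details.

With $\tilde{X}'' := (U', \tilde{X}')$, we have $I(\tilde{X}'' \wedge Y|U') = I(\tilde{X}' \wedge Y|U')$
and similarly for $I(\tilde{X}'' \wedge Z|U')$. It suffices to show the desired
existence with $\tilde{X}'$ replaced by $\tilde{X}''$ on the R.H.S. of (35).

Consider the case $|\mathcal{X}| - 1 \le |\mathcal{Y}| + |\mathcal{Z}| - 2$. The admissible
constraint (36) is equivalent to [2, (A.22)]. (n.b. $V$ in [2] is
$\tilde{X}$ here.) The proof therein also implies $X = X'$, because
$(X', Y, Z)$ need not be changed.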

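Suppose $|\mathcal{X}| - 1 > |\mathcal{Y}| + |\mathcal{Z}| - 2$ instead. To achieve $H(Y)$
and $H(Z)$ in [2, (A.24), (A.25)], one can replace (A.23) by

$$\begin{aligned}
\Pr(Y = y) &= \sum_{u \in \mathcal{U}} \Pr\{U = u\} f_y(p_u) \\
\Pr(Z = z) &= \sum_{u \in \mathcal{U}} \Pr\{U = u\} f_z(p_u)
\end{aligned} \tag{37}$$

where, using the notation in [2],
$f_y(p) := p_Y(y)$ and $f_z(p) := p_Z(z)$.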
Only $|\mathcal{Y}| - 1$ of the functions $f_y(p)$ and $|\mathcal{Z}| - 1$ of the functions
$f_z(p)$ are considered. Thus, as a consequence of the Eggleston-
Carathéodory Theorem, $U$ takes at most $(|\mathcal{Y}| + |\mathcal{Z}| - 2) + 4$
different values to preserve (A.24) to (A.27) in [2] and
(37) defined above. Similarly, (A.28) can be replaced by
the corresponding expressions on $\Pr(Y = y|U = u)$ and
$\Pr(Z = z|U = u)$. For every fixed $u$, there exists a random
variable $V_u$ with no more than $(|\mathcal{Y}| + |\mathcal{Z}| - 2) + 2$ values
preserving the set of desired equalities. With $\tilde{X}$ here playing
the role of the new $V$ in [2], (36) follows. ■